llvm-project

Commit Graph

Author	SHA1	Message	Date
Stanislav Mekhanoshin	f66a43c11a	Process gep (phi ptr1, ptr2) in SROA Differential Revision: https://reviews.llvm.org/D79218	2020-05-29 13:05:51 -07:00
Florian Hahn	01f999ae88	[SCCP] Switch to widen at PHIs, stores and call edges. Currently SCCP does not widen PHIs, stores or along call edges (arguments/return values), but on operations that directly extend ranges (like binary operators). This means PHIs, stores and call edges are not pessimized by widening currently, while binary operators are. The main reason for widening operators initially was that opting-out for certain operations was more straight-forward in the initial implementation (and it did not matter too much, as range support initially was only implemented for a very limited set of operations. During the discussion in D78391, it was suggested to consider flipping widening to PHIs, stores and along call edges. After adding support for tracking the number of range extensions in ValueLattice, limiting the number of range extensions per value is straight forward. This patch introduces a MaxWidenSteps option to the MergeOptions, limiting the number of range extensions per value. For PHIs, it seems natural allow an extension for each (active) incoming value plus 1. For the other cases, a arbitrary limit of 10 has been chosen initially. It would potentially make sense to set it depending on the users of a function/global, but that still needs investigating. This potentially leads to more state-changes and longer compile-times. The results look quite promising (MultiSource, SPEC): Same hash: 179 (filtered out) Remaining: 58 Metric: sccp.IPNumInstRemoved Program base widen-phi diff test-suite...ks/Prolangs-C/agrep/agrep.test 58.00 82.00 41.4% test-suite...marks/SciMark2-C/scimark2.test 32.00 43.00 34.4% test-suite...rks/FreeBench/mason/mason.test 6.00 8.00 33.3% test-suite...langs-C/football/football.test 104.00 128.00 23.1% test-suite...cations/hexxagon/hexxagon.test 36.00 42.00 16.7% test-suite...CFP2000/177.mesa/177.mesa.test 214.00 249.00 16.4% test-suite...ngs-C/assembler/assembler.test 14.00 16.00 14.3% test-suite...arks/VersaBench/dbms/dbms.test 10.00 11.00 10.0% test-suite...oxyApps-C++/miniFE/miniFE.test 43.00 47.00 9.3% test-suite...ications/JM/ldecod/ldecod.test 179.00 195.00 8.9% test-suite...CFP2006/433.milc/433.milc.test 249.00 265.00 6.4% test-suite.../CINT2000/175.vpr/175.vpr.test 98.00 104.00 6.1% test-suite...peg2/mpeg2dec/mpeg2decode.test 70.00 74.00 5.7% test-suite...CFP2000/188.ammp/188.ammp.test 71.00 75.00 5.6% test-suite...ce/Benchmarks/PAQ8p/paq8p.test 111.00 117.00 5.4% test-suite...ce/Applications/Burg/burg.test 41.00 43.00 4.9% test-suite...000/197.parser/197.parser.test 66.00 69.00 4.5% test-suite...tions/lambda-0.1.3/lambda.test 23.00 24.00 4.3% test-suite...urce/Applications/lua/lua.test 301.00 313.00 4.0% test-suite...TimberWolfMC/timberwolfmc.test 76.00 79.00 3.9% test-suite...lications/ClamAV/clamscan.test 991.00 1030.00 3.9% test-suite...plications/d/make_dparser.test 53.00 55.00 3.8% test-suite...fice-ispell/office-ispell.test 83.00 86.00 3.6% test-suite...lications/obsequi/Obsequi.test 28.00 29.00 3.6% test-suite.../Prolangs-C/bison/mybison.test 56.00 58.00 3.6% test-suite.../CINT2000/254.gap/254.gap.test 170.00 176.00 3.5% test-suite.../Applications/lemon/lemon.test 30.00 31.00 3.3% test-suite.../CINT2000/176.gcc/176.gcc.test 1202.00 1240.00 3.2% test-suite...pplications/treecc/treecc.test 79.00 81.00 2.5% test-suite...chmarks/MallocBench/gs/gs.test 357.00 366.00 2.5% test-suite...eeBench/analyzer/analyzer.test 103.00 105.00 1.9% test-suite...T2006/445.gobmk/445.gobmk.test 1697.00 1724.00 1.6% test-suite...006/453.povray/453.povray.test 1812.00 1839.00 1.5% test-suite.../Benchmarks/Bullet/bullet.test 337.00 342.00 1.5% test-suite.../CINT2000/252.eon/252.eon.test 426.00 432.00 1.4% test-suite...T2000/300.twolf/300.twolf.test 214.00 217.00 1.4% test-suite...pplications/oggenc/oggenc.test 244.00 247.00 1.2% test-suite.../CINT2006/403.gcc/403.gcc.test 4008.00 4055.00 1.2% test-suite...T2006/456.hmmer/456.hmmer.test 175.00 177.00 1.1% test-suite...nal/skidmarks10/skidmarks.test 430.00 434.00 0.9% test-suite.../Applications/sgefa/sgefa.test 115.00 116.00 0.9% test-suite...006/447.dealII/447.dealII.test 1082.00 1091.00 0.8% test-suite...6/482.sphinx3/482.sphinx3.test 141.00 142.00 0.7% test-suite...ocBench/espresso/espresso.test 152.00 153.00 0.7% test-suite...3.xalancbmk/483.xalancbmk.test 4003.00 4025.00 0.5% test-suite...lications/sqlite3/sqlite3.test 548.00 551.00 0.5% test-suite...marks/7zip/7zip-benchmark.test 5522.00 5551.00 0.5% test-suite...nsumer-lame/consumer-lame.test 208.00 209.00 0.5% test-suite...:: External/Povray/povray.test 1556.00 1563.00 0.4% test-suite...000/186.crafty/186.crafty.test 298.00 299.00 0.3% test-suite.../Applications/SPASS/SPASS.test 2019.00 2025.00 0.3% test-suite...ications/JM/lencod/lencod.test 8427.00 8449.00 0.3% test-suite...6/464.h264ref/464.h264ref.test 6797.00 6813.00 0.2% test-suite...6/471.omnetpp/471.omnetpp.test 431.00 430.00 -0.2% test-suite...006/450.soplex/450.soplex.test 446.00 447.00 0.2% test-suite...0.perlbench/400.perlbench.test 1729.00 1727.00 -0.1% test-suite...000/255.vortex/255.vortex.test 3815.00 3819.00 0.1% Reviewers: efriedma, nikic, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79036	2020-05-29 11:59:17 +01:00
Philip Reames	a0d2fd4a1f	[Statepoint] Sink actual_args and gc_args to GCStatepointInst [NFC] These are the two operand sets which are expected to survive more than another week or so. Instead of bothering to update the deopt and gc-transition operands, we'll just wait until those are removed and delete the code. For those following along, this is likely to be the last (major) change in this sequence for about a week. I want to wait until all of this has been merged downstream to ensure I haven't introduced any bugs (and migrate some downstream code to the new interfaces). Once that's done, we should be able to delete Statepoint/ImmutableStatepoint without too much work.	2020-05-28 13:51:59 -07:00
aartbik	f719e7d9e7	[llvm] [MatrixIntrinsics] Add row-major support for llvm.matrix.transpose Summary: Only column-major was supported so far. This adds row-major support as well. Note that we probably also want very efficient SIMD implementations for the various target platforms. Bug: https://bugs.llvm.org/show_bug.cgi?id=46085 Reviewers: nicolasvasilache, reidtatge, bkramer, fhahn, ftynse, andydavis1, craig.topper, dcaballe, mehdi_amini, anemet Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80673	2020-05-28 12:13:32 -07:00
Philip Reames	587fa99cfd	Default to generating statepoints with deopt and gc-transition bundles if needed Continues from D80598. The key point of the change is to default to using operand bundles instead of the inline length prefix argument lists for statepoint nodes. An important subtlety to note is that the presence of a bundle has semantic meaning, even if it is empty. As such, we need to make a somewhat deeper change to the interface than is first obvious. Existing code treats statepoint deopt arguments and the deopt bundle operands differently during inlining. The former is ignored (resulting in caller state being dropped), the later is merged. We can't preserve the old behaviour for calls with deopt fed to RS4GC and then inlining, but we can avoid the no-deopt case changing. At least in internal testing, that seem to be the important one. (I'd argue the "stop merging after RS4GC" behaviour for the former was always "unexpected", but that the behaviour for non-deopt calls actually make sense.) Differential Revision: https://reviews.llvm.org/D80674	2020-05-28 10:14:23 -07:00
Matt Arsenault	d6671ee90c	InferAddressSpaces: Handle ptrmask intrinsic This one is slightly odd since it counts as an address expression, which previously could never fail. Allow the existing TTI hook to return the value to use, and re-use it for handling how to handle ptrmask. Handles the no-op addrspacecasts for AMDGPU. We could probably do something better based on analysis of the mask value based on the address space, but leave that for now.	2020-05-28 10:04:02 -04:00
Kazu Hirata	c4990a03c6	[JumpThreading] Use emplace_back instead of push_back (NFC) Summary: This patch replaces push_back with emplace_back where appropriate. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80688	2020-05-27 22:31:23 -07:00
Philip Reames	87bea912c2	[Statepoint] Replace uses of isX functions with idiomatic isa<X> Now that all of the statepoint related routines have classes with isa support, let's cleanup. I'm leaving the (dead) utitilities in tree for a few days so that I can do the same cleanup downstream without breakage.	2020-05-27 18:32:28 -07:00
Layton Kifer	2bf3fe9b6d	[TRE] Allow elimination when the returned value is non-constant Currently we can only eliminate call return pairs that either return the result of the call or a dynamic constant. This patch removes that limitation. Differential Revision: https://reviews.llvm.org/D79660	2020-05-27 16:55:03 -07:00
Daniil Suchkov	706b22e3e4	[SimpleLoopUnswitch] Drop uses of instructions before block deletion Currently if instructions defined in a block are used in unreachable blocks and SimpleLoopUnswitch attempts deleting the block, it triggers assertion "Uses remain when a value is destroyed!". This patch fixes it by replacing all uses of instructions from BB with undefs before BB deletion. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D80551	2020-05-27 18:25:18 +07:00
Stanislav Mekhanoshin	42725aeed8	Process gep (select ptr1, ptr2) in SROA Differential Revision: https://reviews.llvm.org/D79217	2020-05-26 12:56:02 -07:00
Sam Parker	871556a494	[CostModel] Unify Intrinsic Costs. Recommitting most of the remaining changes from `259eb619ff`, but excluding the call to getUserCost from getInstructionThroughput. Though there's still no test changes, I doubt that this is an NFC... With the two getIntrinsicInstrCosts folded into one, now fold in the scalar/code-size orientated getIntrinsicCost. The remaining scalar intrinsics were memcpy, cttz and ctlz which now have special handling in the BasicTTI implementation. This had required a change in the AMDGPU backend for fabs as it should always be 'free'. I've also changed the X86 backend to return the BaseT implementation when the CostKind isn't RecipThroughput. Differential Revision: https://reviews.llvm.org/D80012	2020-05-26 09:48:26 +01:00
Marek Kurdej	bc93c2d72e	[Transforms] Fix typos. NFC	2020-05-25 22:34:08 +02:00
Simon Pilgrim	8b4ecafee6	InstructionSimplify.h - remove unnecessary includes. NFC. Remove unused User.h include. Replace SetVector.h with forward declaration. Sort the forward declarations + remove FastMathFlags (defined in Operator.h). Fix implicit SetVector.h dependency in LowerConstantIntrinsics.cpp.	2020-05-25 13:45:03 +01:00
Matt Arsenault	cdd006eec9	SimplifyCFG: Clean up optforfuzzing implementation This should function as any other SimplifyCFGOption rather than having the transform check and specially consider the attribute itself.	2020-05-23 13:49:50 -04:00
Craig Topper	7392820f98	[Align] Remove operations on MaybeAlign that asserted that it had a defined value. If the caller needs to reponsible for making sure the MaybeAlign has a value, then we should just make the caller convert it to an Align with operator*. I explicitly deleted the relational comparison operators that were being inherited from Optional. It's unclear what the meaning of two MaybeAligns were one is defined and the other isn't should be. So make the caller reponsible for defining the behavior. I left the ==/!= operators from Optional. But now that exposed a weird quirk that ==/!= between Align and MaybeAlign required the MaybeAlign to be defined. But now we use the operator== from Optional that takes an Optional and the Value. Differential Revision: https://reviews.llvm.org/D80455	2020-05-22 21:54:28 -07:00
Sam Parker	259eb619ff	Revert "[CostModel] Unify Intrinsic Costs." This reverts commit `de71def3f5`. This is causing some very large changes, so I'm first going to break this patch down and re-commit in parts.	2020-05-21 12:50:24 +01:00
Sam Parker	de71def3f5	[CostModel] Unify Intrinsic Costs. With the two getIntrinsicInstrCosts folded into one, now fold in the scalar/code-size orientated getIntrinsicCost. This involved sinking cost of the TTIImpl into the base implementation, as it performs no target checks. The opcodes remaining were memcpy, cttz and ctlz which now have special handling in the BasicTTI implementation. getInstructionThroughput can now directly return the result of getUserCost. This had required a change in the AMDGPU backend for fabs and its always 'free'. I've also changed the X86 backend to return '1' for any intrinsic when the CostKind isn't RecipThroughput. Though this intended to be a non-functional change, there are many paths being combined here so I would be very surprised if this didn't have an effect. Differential Revision: https://reviews.llvm.org/D80012	2020-05-21 07:38:25 +01:00
Yevgeny Rouban	8138487468	[BrachProbablityInfo] Set edge probabilities at once and fix calcMetadataWeights() Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probability is error prone. More over it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities. Related bug is fixed in BranchProbabilityInfo::calcMetadataWeights(). Changing unreachable branch probabilities to raw(1) and distributing the rest (oldProbability - raw(1)) over the reachable branches could introduce total probability inaccuracy bigger than 1/numOfBranches. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396	2020-05-21 12:52:37 +07:00
Florian Hahn	bcbd26bfe6	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. This patch was originally committed as `b8a3c34eee`, but broke the modules build, as LoopAccessAnalysis was using the Expander. The code-gen part of LAA was moved to lib/Transforms recently, so this patch can be landed again. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-05-20 10:53:40 +01:00
Nikita Popov	5fae613a4f	[LVI] Don't require DominatorTree in LVI (NFC) After D76797 the dominator tree is no longer used in LVI, so we can remove it as a pass dependency, and also get rid of the dominator tree enabling/disabling logic in JumpThreading. Apart from cleaning up the code, this also clarifies LVI cache consistency, in that the LVI cache can no longer depend on whether the DT was or wasn't enabled due to pending DT updates at any given time. Differential Revision: https://reviews.llvm.org/D76985	2020-05-19 20:21:46 +02:00
Nikita Popov	736db2f710	[Loads] Require Align in isSafeToLoadUnconditionally() (NFC) Now that load/store have required alignment, accept Align here. This also avoids uses of getPointerElementType(), which is incompatible with opaque pointers.	2020-05-18 20:50:35 +02:00
Nikita Popov	52e98f620c	[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC) Now that load/store alignment is required, we no longer need most of them. Also switch the getLoadStoreAlignment() helper to return Align instead of MaybeAlign.	2020-05-17 22:19:15 +02:00
Eli Friedman	4f04db4b54	AllocaInst should store Align instead of MaybeAlign. Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine. Differential Revision: https://reviews.llvm.org/D80044	2020-05-16 14:53:16 -07:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Davide Italiano	da52aa2c33	[LICM] When promoting loads to the preheader, drop the location. It's really almost going to be misleading, see the example in https://bugs.llvm.org/show_bug.cgi?id=45820 Maybe at some point we can do something fancier, but at least this will fix a bug where we step on dead code while debugging.	2020-05-14 17:05:23 -07:00
Eli Friedman	accc6b5545	LoadInst should store Align, not MaybeAlign. The fact that loads and stores can have the alignment missing is a constant source of confusion: code that usually works can break down in rare cases. So fix the LoadInst API so the alignment is never missing. To reduce the number of changes required to make this work, IRBuilder and certain LoadInst constructors will grab the module's datalayout and compute the alignment automatically. This is the same alignment instcombine would eventually apply anyway; we're just doing it earlier. There's a minor risk that the way we're retrieving the datalayout could break out-of-tree code, but I don't think that's likely. This is the last in a series of patches, so most of the necessary changes have already been merged. Differential Revision: https://reviews.llvm.org/D77454	2020-05-14 13:19:21 -07:00
Anna Thomas	eb282be9f8	[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer"" This is relanding of rGbb308b020522420413c7d3f2989a88f2fc423c56 after speculatively fixing buildbot lit test failure which was seen on two bots (I cannot reproduce the lit test failure locally either). [RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer Summary: This is a more general fix to `59029b9eef` (D75704). This patch does the following: updates isKnownBaseValue to account for base pointer and derived pointer having differing types. This inturn allows us to populate the lattice (States) for such derived pointers. It also updates all states where the base and derived pointers have differing types (vector versus scalar) and conservatively marks these states as conflictcs. Note that in `59029b9eef`, we were just fixing existing lattice values and that too, only for uses of extractelement. Reviewers: reames, skatkov, dantrushin Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D76305	2020-05-14 11:17:45 -04:00
Ehud Katz	c6c265527d	Revert "[StructurizeCFG] Fix region nodes ordering" This reverts commit `897d8ee5cd`, due to causing an infinite loop when encountering a loop with a sub-region with an inner loop.	2020-05-14 17:56:39 +03:00
Anna Thomas	f20c62741e	Revert "[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer" This reverts commit `bb308b0205`. Failing a testcase.	2020-05-14 10:16:25 -04:00
Anna Thomas	bb308b0205	[RS4GC] Fix algorithm to avoid setting vector BDV for scalar derived pointer Summary: This is a more general fix to `59029b9eef` (D75704). This patch does the following: 1. updates isKnownBaseValue to account for base pointer and derived pointer having differing types. 2. This inturn allows us to populate the lattice (States) for such derived pointers. 3. It also updates all states where the base and derived pointers have differing types (vector versus scalar) and conservatively marks these states as conflictcs. Note that in `59029b9eef`, we were just fixing existing lattice values and that too, only for uses of extractelement. Reviewers: reames, skatkov, dantrushin Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76305	2020-05-14 10:03:30 -04:00
Alina Sbirlea	bd541b217f	[NewPassManager] Add assertions when getting statefull cached analysis. Summary: Analyses that are statefull should not be retrieved through a proxy from an outer IR unit, as these analyses are only invalidated at the end of the inner IR unit manager. This patch disallows getting the outer manager and provides an API to get a cached analysis through the proxy. If the analysis is not stateless, the call to getCachedResult will assert. Reviewers: chandlerc Subscribers: mehdi_amini, eraman, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72893	2020-05-13 12:38:38 -07:00
Alina Sbirlea	db04ff4b6b	[SimpleLoopUnswitch] Add non-empty unreachable block check to exit cases removed. Summary: Update check to include the check for unreachable. Basic blocks ending in unreachable are special cased, as these blocks may be already unswitched. Before this patch this check is only done for the default destination. The condition for the exit cases and the default case must be the same, because we should never leave edges from the switch instruction to a basic block that we are unswitching. In PR45355 we still have a remaining edge (that we're attempting to remove from the DT) because its the default edge to an unreachable-terminated block where we unswitch a case edge to that block. Resolves PR45355. Reviewers: chandlerc Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78279	2020-05-13 12:38:37 -07:00
Eli Friedman	fcfb3170a7	[SROA] Clean up some uses of MaybeAlign in SROA. Use Align instead of using MaybeAlign; all the operations in question have known alignment. For getSliceAlign() in particular, in the cases where we used to return None, it would be converted back to an Align by IRBuilder, so there's no functional change there. Split off from D77454. Differential Revision: https://reviews.llvm.org/D79205	2020-05-13 11:23:29 -07:00
Reid Kleckner	1370757dd0	Revert "[BrachProbablityInfo] Set edge probabilities at once. NFC." This reverts commit `eef95f2746`. The new assertion about branch propability sums does not hold.	2020-05-13 08:23:09 -07:00
Pierre-vh	2668775f66	[LSR][ARM] Add new TTI hook to mark some LSR chains as profitable This patch adds a new TTI hook to allow targets to tell LSR that a chain including some instruction is already profitable and should not be optimized. This patch also adds an implementation of this TTI hook for ARM so LSR doesn't optimize chains that include the VCTP intrinsic. Differential Revision: https://reviews.llvm.org/D79418	2020-05-13 14:18:28 +01:00
Ehud Katz	897d8ee5cd	[StructurizeCFG] Fix region nodes ordering This is a reimplementation of the `orderNodes` function, as the old implementation didn't take into account all cases. Fix PR41509 Differential Revision: https://reviews.llvm.org/D79037	2020-05-13 15:33:36 +03:00
Yevgeny Rouban	eef95f2746	[BrachProbablityInfo] Set edge probabilities at once. NFC. Hide the method that allows setting probability for particular edge and introduce a public method that sets probabilities for all outgoing edges at once. Setting individual edge probability is error prone. More over it is difficult to check that the total probability is 1.0 because there is no easy way to know when the user finished setting all the probabilities. Reviewers: yamauchi, ebrevnov Tags: #llvm Differential Revision: https://reviews.llvm.org/D79396	2020-05-13 13:55:36 +07:00
KAWASHIMA Takahiro	272bc25bc1	[LoopReroll] Fix rerolling loop with use outside the loop Fixes PR41696 The loop-reroll pass generates an invalid IR (or its assertion fails in debug build) if values of the base instruction and other root instructions (terms used in the loop-reroll pass) are used outside the loop block. See IRs written in PR41696 as examples. The current implementation of the loop-reroll pass can reroll only loops that don't have values that are used outside the loop, except reduced values (the last values of reduction chains). This is described in the comment of the `LoopReroll::reroll` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1600 This is checked in the `LoopReroll::DAGRootTracker::validate` function. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1393 However, the base instruction and other root instructions skip this check in the validation loop. https://github.com/llvm/llvm-project/blob/llvmorg-10.0.0/llvm/lib/Transforms/Scalar/LoopRerollPass.cpp#L1229 Moving the check in front of the skip is the logically simplest fix. However, inserting the check in an earlier stage is better in terms of compilation time of unrerollable loops. This fix inserts the check for the base instruction into the function to validate possible base/root instructions. Check for other root instructions is unnecessary because they don't match any base instructions if they have uses outside the loop. Differential Revision: https://reviews.llvm.org/D79549	2020-05-13 13:03:03 +09:00
Eric Christopher	a42e53cccf	Fix typos encountered while working on pass pipeline for O1.	2020-05-12 00:45:15 -07:00
Layton Kifer	23cbea9a04	[TRE][NFC] Refactor shared state into member variables. Separate functions that require shared state into a class to avoid needing to pass them though multiple functions just to be available where needed. The main motivation for this is that we would like to remove the limitation that accumulator values be dynamic constant, which would require additional shared state between call eliminations in the same function, compounding this issue. Differential Revision: https://reviews.llvm.org/D79299	2020-05-08 14:36:02 -07:00
Diego Caballero	f5224d437e	[LoopFusion] Remove unreachable blocks from DT and LI after fusion This patch removes FC0.ExitBlock and FC1GuardBlock from DT and LI after fusion of guarded loops. They become unreachable and LI verification failed when they happened to be inside another loop. Reviewed By: kbarton Differential Revision: https://reviews.llvm.org/D78679	2020-05-07 16:44:40 -07:00
Alina Sbirlea	6227f021ad	[SimpleLoopUnswitch] Update DefaultExit condition to check unreachable is not empty. Summary: Update the check for the default exit block to not only check that the terminator is not unreachable, but also check that unreachable block has only the unreachable instruction. Reviewers: chandlerc Subscribers: hiraditya, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78277	2020-05-07 13:48:30 -07:00
Whitney Tsang	0a52401ad6	[LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest. Summary: As discussed in https://reviews.llvm.org/D73129. Example Before unroll and jam: for A for B for C D E After unroll and jam (currently): for A A' for B for C D B' for C' D' E E' After unroll and jam (Ideal): for A A' for B B' for C C' D D' E E' This is the first patch to change unroll and jam to work in the ideal way. This patch change the safety checks needed to make sure is safe to unroll and jam in the ideal way. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: Meinersbur Subscribers: fhahn, hiraditya, zzheng, llvm-commits, anhtuyen, prithayan Tag: LLVM Differential Revision: https://reviews.llvm.org/D76132	2020-05-06 21:47:44 +00:00
Benjamin Kramer	d5ea89f891	Quiet some -Wdocumentation warnings.	2020-05-06 11:23:13 +02:00
David Green	146d44c251	[LSR] Don't require register reuse under postinc LSR has some logic that tries to aggressively reuse registers in formula. This can lead to sub-optimal decision in complex loops where the backend it trying to use shouldFavorPostInc. This disables the re-use in those situations. Differential Revision: https://reviews.llvm.org/D79301	2020-05-05 16:04:50 +01:00
Sam Parker	40574fefe9	[NFC][CostModel] Add TargetCostKind to relevant APIs Make the kind of cost explicit throughout the cost model which, apart from making the cost clear, will allow the generic parts to calculate better costs. It will also allow some backends to approximate and correlate the different costs if they wish. Another benefit is that it will also help simplify the cost model around immediate and intrinsic costs, where we currently have multiple APIs. RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/141263.html Differential Revision: https://reviews.llvm.org/D79002	2020-05-05 10:35:54 +01:00
Florian Hahn	935685f420	[SCCP] Re-use pushToWorkList in pushToWorkListMsg (NFC). There's no need to duplicate the logic to push to the different work-lists.	2020-05-04 10:19:39 +01:00
Nikita Popov	b7e2358220	Remove getNumUses() comparisons (NFC) getNumUses() scans the full use list. Don't use it is we only want to check if there's zero or one uses.	2020-05-02 11:05:19 +02:00
Florian Hahn	d911c17596	[SCCP] Get a copy of the state of CopyOf once. This fixes potential reference invalidations, when no lattice value is assigned for CopyOf. As the state of CopyOf won't change while in handleCallResult, we can get a copy once and use that. Should fix PR45749.	2020-05-01 14:46:35 +01:00
Benjamin Kramer	7a5a1e9460	[IR] AttributeList::getContext has a single user, remove it.	2020-05-01 14:18:29 +02:00
Arthur Eubanks	a90948fd6e	[NFC] Rename ByValOrInalloca to PassPointeeByValue Summary: In preparation for preallocated. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79152	2020-04-30 09:42:13 -07:00
Evgeniy Brevnov	3acf62f3ad	[BPI][NFC] IRCE shoud qequest BPI through analysis manager. Summary: There is no need to create BPI explicitly. It should be requested through AM in a normal way. Reviewers: skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79080	2020-04-30 16:04:06 +07:00
Evgeniy Brevnov	3e68a66704	[BPI][NFC] Reuse post dominantor tree from analysis manager when available Summary: Currenlty BPI unconditionally creates post dominator tree each time. While this is not incorrect we can save compile time by reusing existing post dominator tree (when it's valid) provided by analysis manager. Reviewers: skatkov, taewookoh, yrouban Reviewed By: skatkov Subscribers: hiraditya, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78987	2020-04-30 11:31:03 +07:00
Florian Hahn	616657b39c	[LAA] Move CheckingPtrGroup/PointerCheck outside class (NFC). This allows forward declarations of PointerCheck, which in turn reduce the number of times LoopAccessAnalysis needs to be included. Ultimately this helps with moving runtime check generation to Transforms/Utils/LoopUtils.h, without having to include it there. Reviewers: anemet, Ayal Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D78458	2020-04-28 21:47:31 +01:00
Sam Parker	e9c9329aa4	[TTI] Add TargetCostKind argument to getUserCost There are several different types of cost that TTI tries to provide explicit information for: throughput, latency, code size along with a vague 'intersection of code-size cost and execution cost'. The vectorizer is a keen user of RecipThroughput and there's at least 'getInstructionThroughput' and 'getArithmeticInstrCost' designed to help with this cost. The latency cost has a single use and a single implementation. The intersection cost appears to cover most of the rest of the API. getUserCost is explicitly called from within TTI when the user has been explicit in wanting the code size (also only one use) as well as a few passes which are concerned with a mixture of size and/or a relative cost. In many cases these costs are closely related, such as when multiple instructions are required, but one evident diverging cost in this function is for div/rem. This patch adds an argument so that the cost required is explicit, so that we can make the important distinction when necessary. Differential Revision: https://reviews.llvm.org/D78635	2020-04-28 08:57:45 +01:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, to removes use of getElementType on a pointer when we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
Florian Hahn	2f3e86b318	[DSE,MSSA] Continue checking more remaining candidates with dbgcnt. After changing the candidate iteration strategy, we should continue with the next candidate, rather than breaking out of the loop.	2020-04-26 16:59:32 +01:00
Florian Hahn	7d57d22baa	[SCCP] Support ranges for loads and stores. Integer ranges can be used for loaded/stored values. Note that widening can be disabled for loads/stores, as we only rely on instructions that cause continued increases to ranges to be widened (like binary operators). Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78433	2020-04-26 13:16:47 +01:00
Simon Pilgrim	a3982491db	[Pass] Ensure we don't include PassSupport.h or PassAnalysisSupport.h directly Both PassSupport.h and PassAnalysisSupport.h are only supposed to be included via Pass.h. Differential Revision: https://reviews.llvm.org/D78815	2020-04-26 12:58:20 +01:00
Nikita Popov	164845cd92	[GVN] Reduce expression size (NFC) Reduce size of GVN::Expression by reordering fields to reduce padding.	2020-04-26 09:43:35 +02:00
Florian Hahn	46a04940e8	[DSE] Add stat for remaining stores after DSE. Using the existing NumFastStores statistic can be misleading when comparing the impact of DSE patches. For example, consider the case where a store gets removed from a function before it is inlined into another function. A less powerful DSE might only remove the store from functions it has been inlined into, which will result in more stores being removed, but no difference in the actual number of stores after DSE. The new stat provides the absolute number of stores surviving after DSE. Reviewers: dmgreen, bryant, asbirlea, jfb Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D78830	2020-04-25 16:12:55 +01:00
Florian Hahn	e1235831c4	[DSE,MSSA] Improve debug output (NFC). This patch slightly improves the formatting of the debug output, adds a few missing outputs and makes some existing outputs more consistent with the rest.	2020-04-24 17:50:08 +01:00
Florian Hahn	44ce588670	[DSE,MSSA] Skip checking write clobber for DomAccess (NFC). There is no need to check if the starting access for is a write clobber and all of its uses have already been checked.	2020-04-24 17:16:22 +01:00
Mircea Trofin	b8960b5d81	[llvm][NFC][CallSite] Remove remaining {Immutable}CallSite uses Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78789	2020-04-23 22:19:39 -07:00
Craig Topper	cbe77ca9bd	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-23 21:01:48 -07:00
Christopher Tetreault	7ca56c90bd	[SVE] Remove calls to isScalable from Transforms Reviewers: efriedma, chandlerc, reames, aprantl, sdesmalen Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77756	2020-04-23 13:50:07 -07:00
Mircea Trofin	ceb7f308b8	[llvm][NFC][CallSite] Removed CallSite from few implementation details Reviewers: dblaikie, craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78724	2020-04-23 10:36:36 -07:00
Juneyoung Lee	aca335955c	[ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 08:08:53 +09:00
Juneyoung Lee	5ceef26350	Revert "RFC: [ValueTracking] Let analyses assume a value cannot be partially poison" This reverts commit `80faa8c3af`.	2020-04-23 08:07:09 +09:00
Juneyoung Lee	80faa8c3af	RFC: [ValueTracking] Let analyses assume a value cannot be partially poison Summary: This is RFC for fixes in poison-related functions of ValueTracking. These functions assume that a value can be poison bitwisely, but the semantics of bitwise poison is not clear at the moment. Allowing a value to have bitwise poison adds complexity to reasoning about correctness of optimizations. This patch makes the analysis functions simply assume that a value is either fully poison or not, which has been used to understand the correctness of a few previous optimizations. The bitwise poison semantics seems to be only used by these functions as well. In terms of implementation, using value-wise poison concept makes existing functions do more precise analysis, which is what this patch contains. Reviewers: spatel, lebedev.ri, jdoerfert, reames, nikic, nlopes, regehr Reviewed By: nikic Subscribers: fhahn, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78503	2020-04-23 07:57:12 +09:00
Florian Hahn	352b612a71	[SCCP] Drop unnecessary early exit for ExtractValueInst. visitExtractValueInst uses mergeInValue, so it already can handle constant ranges. Initially the early exit was using isOverdefined to keep things as NFC during the initial move to ValueLatticeElement. As the function already supports constant ranges, it can just use ValueState[&I].isOverdefined. Reviewers: efriedma, mssimpso, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78393	2020-04-22 22:07:59 +01:00
Craig Topper	be04aba6fc	[CallSite removal][ValueTracking] Use CallBase instead of ImmutableCallSite for getIntrinsicForCallSite. NFC Differential Revision: https://reviews.llvm.org/D78613	2020-04-22 12:06:58 -07:00
Mircea Trofin	1b6b05a250	[llvm][NFC][CallSite] Remove CallSite from a few trivial locations Summary: Implementation details and internal (to module) APIs. Reviewers: craig.topper, dblaikie Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78610	2020-04-22 08:39:21 -07:00
Craig Topper	05a11974ae	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-22 00:07:13 -07:00
Benjamin Kramer	9a08c30705	Bit-pack some pairs. No functionlity change intended.	2020-04-21 20:40:20 +02:00
Max Kazantsev	a116f0fa86	[LICM][NFC] Reorder checks to speed up things slightly Side effect check is made faster than potentially heavy other checks.	2020-04-21 11:34:44 +07:00
Craig Topper	68b2e507e4	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 21:31:44 -07:00
Craig Topper	fcc9d70260	Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign." This is breaking the clang build. This reverts commit `897409fb56`.	2020-04-20 13:25:06 -07:00
Craig Topper	897409fb56	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 13:08:05 -07:00
Nikita Popov	54d01cbc15	[IPT] Don't use OrderedInstructions (NFC) Use Instruction::comesBefore() instead of OrderedInstructions inside InstructionPrecedenceTracking. This also removes the dominator tree dependency. Differential Revision: https://reviews.llvm.org/D78461	2020-04-20 18:25:31 +02:00
Bjorn Pettersson	a8a31fdd80	[Scalarizer] Fix a non-deterministic scatter order problem Summary: The indexing operator in Scatterer may result in building new instructions. When using multiple such operators in a function argument list the order in which we build instructions depend on argument evaluation order (which is undefined in C++). This patch avoid such problems by expanding the components using the [] operator prior to the function call. Problem was seen when comparing output, while builing LLVM with different compilers (clang vs gcc). Reviewers: foad, cameron.mcinally, uabelho Reviewed By: foad Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78455	2020-04-20 16:05:33 +02:00
Craig Topper	53ee8fbc23	[CallSite removal][SCCP] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78470	2020-04-20 00:16:09 -07:00
Florian Hahn	a7aaadc135	[TTI] Clean up includes (NFC). Remove some unnecessary includes, replace some with forward declarations. This also exposed a few places that were missing some includes.	2020-04-19 20:11:59 +01:00
Florian Hahn	32af48cdcf	[IVDescriptors] Clean up includes. Some includes are not required and forward declarations can be used instead. This also exposed a few places that were not directly including required files.	2020-04-19 20:07:47 +01:00
Florian Hahn	6ba0695c60	[ValueLattice] Add struct for merge options. This makes it easier to extend the merge options in the future and also reduces the risk of accidentally setting a wrong option. Reviewers: efriedma, nikic, reames, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78368	2020-04-19 09:03:16 +01:00
Florian Hahn	46853b95ca	[SCCP] Drop unused early exit from visitStoreInst (NFC). There are no lattice values associated with store instructions directly. They will never get marked as overdefined.	2020-04-18 19:44:54 +01:00
Florian Hahn	034e8d58a8	[SCCP] Drop unused early exit from visitReturnInst (NFC). There are no lattice values associated with return instructions directly. They will never get marked as overdefined.	2020-04-18 13:52:41 +01:00
Craig Topper	0feaba683e	[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC There are also some adjustments to use MaybeAlign in here due to CallBase::getParamAlignment() being deprecated. It would be a little cleaner if getOrEnforceKnownAlignment was migrated to Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78345	2020-04-17 10:32:45 -07:00
Craig Topper	8c94d616e1	Revert "[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC" There were extra changes that weren't supposed to be in there This reverts commit `b91f78db37`.	2020-04-17 10:11:22 -07:00
Craig Topper	b91f78db37	[CallSite removal][MemCpyOptimizer] Replace CallSite with CallBase. NFC There are also some adjustments to use MaybeAlign in here due to CallBase::getParamAlignment() being deprecated. It would be cleaner if getOrEnforceKnownAlignment was migrated to Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78345	2020-04-17 10:07:20 -07:00
Florian Hahn	c245d3e033	[ValueLattice] Steal bits from Tag to track range extensions (NFC). Users of ValueLatticeElement currently have to ensure constant ranges are not extended indefinitely. For example, in SCCP, mergeIn goes to overdefined if a constantrange value is repeatedly merged with larger constantranges. This is a simple form of widening. In some cases, this leads to an unnecessary loss of information and things can be improved by allowing a small number of extensions in the hope that a fixed point is reached after a small number of steps. To make better decisions about widening, it is helpful to keep track of the number of range extensions. That state is tied directly to a concrete ValueLatticeElement and some unused bits in the class can be used. The current patch preserves the existing behavior by default: CheckWiden defaults to false and if CheckWiden is true, a single change to the range is allowed. Follow-up patches will slightly increase the threshold for widening. Reviewers: efriedma, davide, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78145	2020-04-17 15:38:23 +01:00
Benjamin Kramer	166467e822	[VectorUtils] Create shufflevector masks as int vectors instead of Constants No functionality change intended.	2020-04-17 15:28:00 +02:00
Bjorn Pettersson	fdf9bad573	[Float2Int] Stop passing around a reference to the class member Roots. NFC The Float2IntPass got a class member called Roots, but Roots was also passed around to member function as a reference. This patch simply remove those references.	2020-04-16 15:24:13 +02:00
Alina Sbirlea	edccc35e8f	[Reassociate] Preserve AAManager and BasicAA analyses. Now Reassociate Pass invalidates the analysis results of AAManager and BasicAA, but it saves GlobalsAA, although it seems that it should preserve them, since it affects only Unary and Binary operators. Author: kpolushin (Kirill) Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D77137	2020-04-15 16:58:03 -07:00
Craig Topper	240725666a	[CallSite removal][CallSiteSplitting] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78240	2020-04-15 15:38:02 -07:00
Craig Topper	592d8e7d75	[CallSite removal][SimpleLoopUnswitch] Use CallBase instead of CallSite. NFC Differential Revision: https://reviews.llvm.org/D78227	2020-04-15 13:25:02 -07:00
Davide Italiano	5f87415efc	[LICM] Try to merge debug locations when sinking. The current strategy LICM uses when sinking for debuginfo is that of picking the debug location of one of the uses. This causes stepping to be wrong sometimes, see, e.g. PR45523. This patch introduces a generalization of getMergedLocation(), that operates on a vector of locations instead of two, and try to merge all them together, and use the new API in LICM. <rdar://problem/61750950>	2020-04-15 12:29:34 -07:00
Benjamin Kramer	6f64daca8f	Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints No functionality change intended.	2020-04-15 12:51:38 +02:00
Florian Hahn	cf9ee49b4d	[DSE] Lift post-dominance for objs not accessible in caller. We can eliminate MemoryDefs of objects not accessible after the function returns (e.g. alloca), if there are no reads between the MemoryDef and any function exits. We can stop traversing paths that completely overwrite the memory location of the MemoryDef. This patch was split off D73763. Reviewers: dmgreen, bryant, asbirlea, Tyker, efriedma, george.burgess.iv Reviewed By: asbirlea, george.burgess.iv Differential Revision: https://reviews.llvm.org/D77736	2020-04-15 11:37:14 +01:00
Max Kazantsev	f8a42bca28	[ADCE] Fix incorrect reporting of CFG changes This patch fixes 2 related bugs in ADCE: - `performDeadCodeElimination` does not report changes if it did ONLY CFG changes (affects both old and new pass managers); - When control flow removal is enabled, new pass manager does not drop CFG analyses. Both can lead to incorrect loop info after ADCE that does only CFG changes. Differential Revision: https://reviews.llvm.org/D78103 Reviewed By: Denis Antrushin	2020-04-14 20:26:13 +07:00
Aaron Puchert	e833e58300	[ValueLattice] Remove unused DataLayout parameter of mergeIn, NFC Reviewed By: fhahn, echristo Differential Revision: https://reviews.llvm.org/D78061	2020-04-14 13:32:53 +02:00
Florian Hahn	38609fa9e4	Recommit "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This includes a fix reported with simplifications in the presence of NaN. This reverts the revert commit `06408451bf`.	2020-04-14 11:48:52 +01:00
Tyker	3bdfa966ec	[AssumeBundles] preserve knowledge in DCE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77403	2020-04-14 12:48:15 +02:00
Tyker	086de7673e	[AssumeBundles] preserve knowledge in DSE Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77404	2020-04-14 12:48:15 +02:00
Tyker	de4dc275f5	[AssumeBundles] preserve information in NewGVN Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77406	2020-04-14 12:48:14 +02:00
Tyker	c35194b800	[AssumeBundles] preserve information in LICM Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77407	2020-04-14 12:48:14 +02:00
Tyker	1d2b76a8fc	[AssumeBundles] adapte GVN to assume bundles Summary: prevent GVN from removing assume bundles make GVN preserve information from removed instructions Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77405	2020-04-14 12:48:14 +02:00
Mircea Trofin	4aae4e3f48	[llvm][NFC] CallSite removal from inliner-related files Summary: This removes CallSite from inliner files. Some dependencies where thus affected. Reviewers: dblaikie, davidxl, craig.topper Subscribers: arsenm, jvesely, nhaehnle, eraman, hiraditya, aheejin, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77991	2020-04-13 21:28:58 -07:00
Tyker	813f438baa	[AssumeBundles] adapt Assumption cache to assume bundles Summary: change assumption cache to store an assume along with an index to the operand bundle containing the knowledge. Reviewers: jdoerfert, hfinkel Reviewed By: jdoerfert Subscribers: hiraditya, mgrang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77402	2020-04-13 12:04:51 +02:00
Benjamin Kramer	06408451bf	Revert "[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef." This reverts commit `1a02aaeaa4`. Crashes on the following test case: $ cat crash.ll source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\00\00\C0\7F\09\85\08?\ED\C94\FE~\EB/\F3\90\CF\BA\C1" @1 = private unnamed_addr constant [24 x i8] c"\00\00\C0\7F\A3\A0\0FA\00\00\C0\7F\00\00\C0\7F\00\00\00\00\02\9AA\00" define void @IgammaSpecialValues.448() { entry: br label %fusion.26.loop_header.dim.0 fusion.26.loop_header.dim.0: ; preds = %fusion.26.loop_header.dim.0, %entry %fusion.26.invar_address.dim.0.0 = phi i64 [ 0, %entry ], [ %invar.inc17, %fusion.26.loop_header.dim.0 ] %0 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @0 to [6 x float]), i64 0, i64 %fusion.26.invar_address.dim.0.0 %1 = load float, float %0 %2 = fmul float %1, 0.000000e+00 %3 = getelementptr inbounds [6 x float], [6 x float]* bitcast ([24 x i8]* @1 to [6 x float]), i64 0, i64 %fusion.26.invar_address.dim.0.0 %4 = load float, float %3 %5 = fneg float %4 %6 = fadd float %2, %5 %invar.inc17 = add nuw nsw i64 %fusion.26.invar_address.dim.0.0, 1 br label %fusion.26.loop_header.dim.0 } $ opt -ipsccp -S < crash.ll opt: llvm/include/llvm/Analysis/ValueLattice.h:251: bool llvm::ValueLatticeElement::markConstant(llvm::Constant *, bool): Assertion `getConstant() == V && "Marking constant with different value"' failed.	2020-04-13 11:23:26 +02:00
Florian Hahn	1a02aaeaa4	[SCCP] Use SimplifyBinOp for non-integer constant/expressions & overdef. For non-integer constants/expressions and overdefined, I think we can just use SimplifyBinOp to do common folds. By just passing a context with the DL, SimplifyBinOp should not try to get additional information from looking at definitions. For overdefined values, it should be enough to just pass the original operand. Note: The comment before the `if (isconstant(V1State)...` was wrong originally: isConstant() also matches integer ranges with a single element. It is correct now. Reviewers: efriedma, davide, mssimpso, aartbik Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76459	2020-04-10 11:02:57 +01:00
Max Kazantsev	4e87823026	[LoopLoadElim] Fix crash by always checking simplify form Loop simplify form should always be checked because logic of propagateStoredValueToLoadUsers relies on it (in particular, it requires preheader). Reviewed By: Fedor Sergeev, Florian Hahn Differential Revision: https://reviews.llvm.org/D77775	2020-04-10 09:23:28 +07:00
Christopher Tetreault	19cc9b9ded	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: efriedma, sdesmalen, rriddle Reviewed By: sdesmalen Subscribers: hiraditya, dantrushin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77261	2020-04-09 14:59:14 -07:00
Florian Hahn	bbbec71609	[DSE.MSSA] Only use callCapturesBefore for calls. callCapturesBefore always returns ModRef , if UseInst isn't a call. As we only call it if we already know Mod is set, this only destroys the Must bit for non-calls.	2020-04-08 15:12:33 +01:00
Florian Hahn	a6353fdf3b	[DSE,MSSA] Hoist getMemoryAccess call (NFC).	2020-04-08 15:10:05 +01:00
Max Kazantsev	7adb9e06fd	[LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM	2020-04-08 17:32:03 +07:00
Kazu Hirata	91eb442fde	[JumpThreading] NFC: Simplify ComputeValueKnownInPredecessorsImpl Summary: ComputeValueKnownInPredecessorsImpl is the main folding mechanism in JumpThreading.cpp. To avoid potential infinite recursion while chasing use-def chains, it uses: DenseSet<std::pair<Value , BasicBlock >> &RecursionSet to keep track of Value-BB pairs that we've processed. Now, when ComputeValueKnownInPredecessorsImpl recursively calls itself, it always passes BB as is, so the second element is always BB. This patch simplifes the function by dropping "BasicBlock *" from RecursionSet. Reviewers: wmi, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77699	2020-04-07 18:37:36 -07:00
Florian Hahn	6aabb109be	[SCCP] Use ranges for predicate info conditions. This patch updates the code that deals with conditions from predicate info to make use of constant ranges. For ssa_copy instructions inserted by PredicateInfo, we have 2 ranges: 1. The range of the original value. 2. The range imposed by the linked condition. 1. is known, 2. can be determined using makeAllowedICmpRegion. The intersection of those ranges is the range for the copy. With this patch, we get a nice increase in the number of instructions eliminated by both SCCP and IPSCCP for some benchmarks: For MultiSource, SPEC2000 & SPEC2006: Tests: 237 Same hash: 170 (filtered out) Remaining: 67 Metric: sccp.NumInstRemoved Program base patch diff test-suite...Source/Benchmarks/sim/sim.test 10.00 71.00 610.0% test-suite...CFP2000/177.mesa/177.mesa.test 361.00 1626.00 350.4% test-suite...encode/alacconvert-encode.test 141.00 602.00 327.0% test-suite...decode/alacconvert-decode.test 141.00 602.00 327.0% test-suite...CI_Purple/SMG2000/smg2000.test 1639.00 4093.00 149.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 75.00 163.00 117.3% test-suite...T2006/401.bzip2/401.bzip2.test 358.00 513.00 43.3% test-suite...rks/FreeBench/pifft/pifft.test 11.00 15.00 36.4% test-suite...langs-C/unix-tbl/unix-tbl.test 4.00 5.00 25.0% test-suite...lications/sqlite3/sqlite3.test 541.00 667.00 23.3% test-suite.../CINT2000/254.gap/254.gap.test 243.00 299.00 23.0% test-suite...ks/Prolangs-C/agrep/agrep.test 25.00 29.00 16.0% test-suite...marks/7zip/7zip-benchmark.test 1135.00 1304.00 14.9% test-suite...lications/ClamAV/clamscan.test 1105.00 1268.00 14.8% test-suite...urce/Applications/lua/lua.test 398.00 436.00 9.5% Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...C/CFP2000/179.art/179.art.test 1.00 3.00 200.0% test-suite...006/447.dealII/447.dealII.test 429.00 1056.00 146.2% test-suite...nch/fourinarow/fourinarow.test 3.00 7.00 133.3% test-suite...CI_Purple/SMG2000/smg2000.test 818.00 1748.00 113.7% test-suite...ks/McCat/04-bisect/bisect.test 3.00 5.00 66.7% test-suite...CFP2000/177.mesa/177.mesa.test 165.00 255.00 54.5% test-suite...ediabench/gsm/toast/toast.test 18.00 27.00 50.0% test-suite...telecomm-gsm/telecomm-gsm.test 18.00 27.00 50.0% test-suite...ks/Prolangs-C/agrep/agrep.test 24.00 35.00 45.8% test-suite...TimberWolfMC/timberwolfmc.test 43.00 62.00 44.2% test-suite...encode/alacconvert-encode.test 46.00 66.00 43.5% test-suite...decode/alacconvert-decode.test 46.00 66.00 43.5% test-suite...langs-C/unix-tbl/unix-tbl.test 12.00 17.00 41.7% test-suite...peg2/mpeg2dec/mpeg2decode.test 31.00 41.00 32.3% test-suite.../CINT2000/254.gap/254.gap.test 117.00 154.00 31.6% Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76611	2020-04-07 11:09:18 +01:00
Eli Friedman	3f13ee8a00	[NFC] Modernize misc. uses of Align/MaybeAlign APIs. Use the current getAlign() APIs where it makes sense, and use Align instead of MaybeAlign when we know the value is non-zero.	2020-04-06 17:53:04 -07:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type heirarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Kirill Naumov	3f995ce8b5	[CFGPrinter][CallPrinter][polly] Adding distinct structure for CFGDOTInfo The patch introduces the system to distinctively store the information needed for the Control Flow Graph as well as the instrumentary needed for the follow-up changes: BlockFrequencyInfo and BranchProbabilityInfo. The patch is a part of sequence of three patches, related to graphs Heat Coloring. Reviewers: rcorcs, apilipenko, davidxl, sfertile, fedor.sergeev, eraman, bollu Differential Revision: https://reviews.llvm.org/D76820	2020-04-06 17:42:54 +00:00
Guillaume Chatelet	808286342a	[Alignment][NFC] Assume AlignmentFromAssumptions::getNewAlignment is always set. Summary: In D77454 we explain that `LoadInst` and `StoreInst` always have their alignment defined. This allows to work backward here and to infer that `getNewAlignment` does not need to return `0` in case of failure. Returning `1` also works since it needs to be greater than the Load/Store alignment which is a least `1`. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77538	2020-04-06 14:54:57 +00:00
Florian Hahn	6babae74c7	[Matrix] Update load/storeMatrix to take indices as Value* (NFC). This allows using the functions to be used with loop dependent indices.	2020-04-06 14:48:48 +01:00
Florian Hahn	39f2d9aa81	[Matrix] Add option to use row-major matrix layout as default. This patch adds a -matrix-default-layout option which can be used to set the default matrix layout to row-major or column-major (default). The initial patch updates codegen for loads, stores, binary operators and matrix multiply. Reviewers: anemet, Gerolf, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76325	2020-04-06 10:00:56 +01:00
Florian Hahn	d1fed7081d	[Matrix] Add initial tiling for load/multiply/store chains. This patch adds initial fusion for load/multiply/store chains of matrix operations. The patch contains roughly two parts: 1. Code generation for a fused load/multiply/store chain (LowerMatrixMultiplyFused). First, we ensure that both loads of the multiply operands do not alias the store. If they do, we create new non-aliasing copies of the operands. Note that this may introduce new basic block. Finally we process TileSize x TileSize blocks. That is: load tiles from the input operands, multiply and store them. 2. Identify fusion candidates & matrix instructions. As a first step, collect all instructions with shape info and fusion candidates (currently @llvm.matrix.multiply calls). Next, try to fuse candidates and collect instructions eliminated by fusion. Finally iterate over all matrix instructions, skip the ones eliminated by fusion and lower the rest as usual. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75566	2020-04-06 09:28:15 +01:00
Guillaume Chatelet	6000478f39	Revert "[Alignment][NFC] Add DebugStr and operator*" This reverts commit `1e34ab98fc`.	2020-04-06 07:55:25 +00:00
Guillaume Chatelet	1e34ab98fc	[Alignment][NFC] Add DebugStr and operator* Summary: Also updates files to use them. This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77394	2020-04-06 07:12:46 +00:00
Nikita Popov	ebd5a1b049	[Reassociate] Use IRBuilderBase; NFC And remove now unnecessary IRBuilder.h include in header.	2020-04-04 12:34:16 +02:00
Nikita Popov	1055e9e3c8	[IVDescriptors] Remove IRBuilder.h include; NFC IVDescriptors.h itself does not reference IRBuilder at all. Move the include into transformation passes that do.	2020-04-04 12:07:57 +02:00
OCHyams	9b56cc9361	[DebugInfo] Salvage debug info when sinking loop invariant instructions Reviewed By: vsk, aprantl, djtodoro Differential Revision: https://reviews.llvm.org/D77318	2020-04-03 09:19:26 +01:00
Benjamin Kramer	de8831934a	[LoopDataPrefetch] Remove unused include that's a layering violation	2020-04-02 17:46:10 +02:00
Tyker	c00cb76274	[NFC] Split Knowledge retention and place it more appropriatly Summary: Splitting Knowledge retention into Queries in Analysis and Builder into Transform/Utils allows Queries and Transform/Utils to use Analysis. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77171	2020-04-02 15:01:41 +02:00
Jonas Paulsson	36d4421f50	[LoopDataPrefetch + SystemZ] Let target decide on prefetching for each loop. This patch adds - New arguments to getMinPrefetchStride() to let the target decide on a per-loop basis if software prefetching should be done even with a stride within the limit of the hw prefetcher. - New TTI hook enableWritePrefetching() to let a target do write prefetching by default (defaults to false). - In LoopDataPrefetch: - A search through the whole loop to gather information before emitting any prefetches. This way the target can get information via new arguments to getMinPrefetchStride() and emit prefetches more selectively. Collected information includes: Does the loop have a call, how many memory accesses, how many of them are strided, how many prefetches will cover them. This is NFC to before as long as the target does not change its definition of getMinPrefetchStride(). - If a previous access to the same exact address was 'read', and the current one is 'write', make it a 'write' prefetch. - If two accesses that are covered by the same prefetch do not dominate each other, put the prefetch in a block that dominates both of them. - If a ConstantMaxTripCount is less than ItersAhead, then skip the loop. - A SystemZ implementation of getMinPrefetchStride(). Review: Ulrich Weigand, Michael Kruse Differential Revision: https://reviews.llvm.org/D70228	2020-04-02 14:57:46 +02:00
Florian Hahn	a63b5c9e53	[CallSiteSplitting] Simplify isPredicateOnPHI & continue checking PHIs. As pointed out by @thakis, currently CallSiteSplitting bails out after checking the first PHI node. We should check all PHI nodes, until we find one where call site splitting is beneficial. This patch also slightly simplifies the code using BasicBlock::phis(). Reviewers: davidxl, junbuml, thakis Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D77089	2020-04-02 10:11:27 +01:00
Jonathan Roelofs	1148f004fa	Fix PR45371: SeparateConstOffsetFromGEP clean up bookkeeping find() was altering the UserChain, even in cases where it subsequently discovered that the resulting constant was a 0. This confuses rebuildWithoutConstOffset() when it attempts to walk the chain later, since it is expected that the chain itself be a path down the use-def edges of an expression.	2020-04-01 12:38:15 -06:00
Uday Bondhugula	6ee11c3b0f	[NewGVN] Make NewGVN aware of aligned_alloc Make the New GVN pass aware of aligned_alloc. Depends on D76975. Differential Revision: https://reviews.llvm.org/D76976	2020-04-01 23:26:51 +05:30
Uday Bondhugula	4cf70af94f	[GVN] Make GVN aware of aligned_alloc Make the GVN pass aware of aligned_alloc. Depends on D76974. Differential Revision: https://reviews.llvm.org/D76975	2020-04-01 23:26:50 +05:30
Benjamin Kramer	66b9f5f7f0	[GVNSink] Simplify code. NFC.	2020-04-01 13:13:00 +02:00
Cullen Rhodes	84aa6cf1a9	[Transforms][SROA] Promote allocas with mem2reg for scalable types Summary: Aggregate types containing scalable vectors aren't supported and as far as I can tell this pass is mostly concerned with optimisations on aggregate types, so the majority of this pass isn't very useful for scalable vectors. This patch modifies SROA such that mem2reg is run on allocas with scalable types that are promotable, but nothing else such as slicing is done. The use of TypeSize in this pass has also been updated to be explicitly fixed size. When invoking the following methods in DataLayout: * getTypeSizeInBits * getTypeStoreSize * getTypeStoreSizeInBits * getTypeAllocSize we now called getFixedSize on the resultant TypeSize. This is quite an extensive change with around 50 calls to these functions, and also the first change of this kind (being explicit about fixed vs scalable size) as far as I'm aware, so feedback welcome. A test is included containing IR with scalable vectors that this pass is able to optimise. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76720	2020-04-01 10:34:11 +00:00
Eli Friedman	ba4764c2cc	Fix leak in GVNSink introduced in D72467.	2020-03-31 16:21:27 -07:00
Evgenii Stepanov	f9471b0010	Fix MSan false positive due to select folding. Summary: Select folding in JumpThreading can create a conditional branch on a code patch that did not have one in the original program. This is not a valid transformation in sanitize_memory functions. Note that JumpThreading does select folding in 3 different places. Two of them seem safe - they apply to a select instruction in a BB that ends with an unconditional branch to another BB, which (in turn) ends with a conditional branch or a switch with the same condition. Fixes PR45220. Reviewers: glider, dvyukov, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76332	2020-03-31 15:25:42 -07:00
Eli Friedman	1ee6ec2bf3	Remove "mask" operand from shufflevector. Instead, represent the mask as out-of-line data in the instruction. This should be more efficient in the places that currently use getShuffleVector(), and paves the way for further changes to add new shuffles for scalable vectors. This doesn't change the syntax in textual IR. And I don't currently plan to change the bitcode encoding in this patch, although we'll probably need to do something once we extend shufflevector for scalable types. I expect that once this is finished, we can then replace the raw "mask" with something more appropriate for scalable vectors. Not sure exactly what this looks like at the moment, but there are a few different ways we could handle it. Maybe we could try to describe specific shuffles. Or maybe we could define it in terms of a function to convert a fixed-length array into an appropriate scalable vector, using a "step", or something like that. Differential Revision: https://reviews.llvm.org/D72467	2020-03-31 13:08:59 -07:00
Florian Hahn	b0cd7b2799	[SCCP] Limit use of range info for binops to integers for now. This fixes a crash when building the test suite.	2020-03-31 17:08:09 +01:00
Tyker	4aeb7e1ef4	[AssumeBundles] Preserve information in EarlyCSE Summary: this patch preserve information from various places in EarlyCSE into assume bundles. Reviewers: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76769	2020-03-31 17:47:04 +02:00
Florian Hahn	b37543750c	[ValueLattice] Distinguish between constant ranges with/without undef. This patch updates ValueLattice to distinguish between ranges that are guaranteed to not include undef and ranges that may include undef. A constant range guaranteed to not contain undef can be used to simplify instructions to arbitrary values. A constant range that may contain undef can only be used to simplify to a constant. If the value can be undef, it might take a value outside the range. For example, consider the snipped below define i32 @f(i32 %a, i1 %c) { br i1 %c, label %true, label %false true: %a.255 = and i32 %a, 255 br label %exit false: br label %exit exit: %p = phi i32 [ %a.255, %true ], [ undef, %false ] %f.1 = icmp eq i32 %p, 300 call void @use(i1 %f.1) %res = and i32 %p, 255 ret i32 %res } In the exit block, %p would be a constant range [0, 256) including undef as %p could be undef. We can use the range information to replace %f.1 with false because we remove the compare, effectively forcing the use of the constant to be != 300. We cannot replace %res with %p however, because if %a would be undef %cond may be true but the second use might not be < 256. Currently LazyValueInfo uses the new behavior just when simplifying AND instructions and does not distinguish between constant ranges with and without undef otherwise. I think we should address the remaining issues in LVI incrementally. Reviewers: efriedma, reames, aqjune, jdoerfert, sstefan1 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D76931	2020-03-31 12:50:20 +01:00
Florian Hahn	0c9c58ada0	[SCCP] Use constant ranges for casts. For casts with constant range operands, we can use ConstantRange::castOp. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71938	2020-03-31 09:22:04 +01:00
Sameer Sahasrabuddhe	3cbbded68c	Introduce unify-loop-exits pass. For each natural loop with multiple exit blocks, this pass creates a new block N such that all exiting blocks now branch to N, and then control flow is redistributed to all the original exit blocks. The bulk of the tranformation is a new function introduced in BasicBlockUtils that an redirect control flow from a set of incoming blocks to a set of outgoing blocks via a common "hub". This is a useful workaround for a limitation in the structurizer which incorrectly orders blocks when processing a nest of loops. This pass bypasses that issue by ensuring that each natural loop is recognized as a separate region. Since the structurizer is a region pass, it no longer sees a nest of loops in a single region, and instead processes each "level" in the nesting as a separate region. The AMDGPU backend provides a new option to enable this pass before the structurizer, which may eventually be enabled by default. Reviewers: madhur13490, arsenm, nhaehnle Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D75865	2020-03-30 13:23:56 -04:00
Chris Jackson	135709aa90	[DebugInfo] Ensure dead store elimination can mark an operand value as undefined - Correct a debug info salvage and add a test Reviewers: aprantl, vsk Differential Revision: https://reviews.llvm.org/D76930 Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=45080	2020-03-30 14:58:14 +01:00
Florian Hahn	9e81249d76	[Matrix] Rename emitChainedMatrixMultiply to emitMatrixMultiply (NFC). The Chained in the name potentially leads to confusion. Also updated the comment to drop the unnecessary mention of tile-sized.	2020-03-30 11:17:25 +01:00
Richard Diamond	4bf015c035	[AlignmentFromAssumptions] Fix a SCEV assertion resulting from address space differences. Summary: On targets with different pointer sizes, -alignment-from-assumptions could attempt to create SCEV expressions which use different effective SCEV types. The provided test illustrates the issue. In `getNewAlignment`, AASCEV would be the (only) alloca, which would have an effective SCEV type of i32. But PtrSCEV, the GEP in this case, due to being in the flat/default address space, will have an effective SCEV of i64. This patch resolves the issue by truncating PtrSCEV to AASCEV's effective type. Reviewers: hfinkel, jdoerfert Reviewed By: jdoerfert Subscribers: jvesely, nhaehnle, hiraditya, javed.absar, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75471	2020-03-29 01:26:31 -05:00
Enna1	03bc311a16	[CorrelatedValuePropagation] Remove redundant if statement in processSelect() This statement if (ReplaceWith == S) ReplaceWith = UndefValue::get(S->getType()); is introduced in https://reviews.llvm.org/rG35609d97ae89b8e13f40f4e6b9b056954f8baa83 to fix a case where unreachable code can cause select instruction simplification to fail. In https://reviews.llvm.org/rGd10480657527ffb44ea213460fb3676a6b1300aa, we begin to perform a depth-first walk of basic blocks. This means we will not visit unreachable blocks. So we do not need this the special check any more. Differential Revision: https://reviews.llvm.org/D76753	2020-03-28 18:01:17 +01:00
Florian Hahn	81f173ed0e	[SCCP] Remove LatticeVal alias now that transition is done (NFC). The LatticeVal alias was introduced to reduce the diff size for the transition to ValueLatticeElement, which is done now. This patch removes the unnecessary alias and updates some very verbose type uses with auto.	2020-03-28 15:40:24 +00:00
Florian Hahn	a44bf59c93	[SCCP] Remove unused toLatticeValue helper (NFC). LatticeVal is an alias for ValueLatticeElement and the function is not used any longer.	2020-03-28 15:40:24 +00:00
Juneyoung Lee	49f75132bc	[DivRemPairs] Freeze operands if they can be undef values Summary: DivRemPairs is unsound with respect to undef values. ``` // bb1: // %rem = srem %x, %y // bb2: // %div = sdiv %x, %y // --> // bb1: // %div = sdiv %x, %y // %mul = mul %div, %y // %rem = sub %x, %mul ``` If X can be undef, X should be frozen first. For example, let's assume that Y = 1 & X = undef: ``` %div = sdiv undef, 1 // %div = undef %rem = srem undef, 1 // %rem = 0 => %div = sdiv undef, 1 // %div = undef %mul = mul %div, 1 // %mul = undef %rem = sub %x, %mul // %rem = undef - undef = undef ``` http://volta.cs.utah.edu:8080/z/m7Xrx5 Same for Y. If X = 1 and Y = (undef \| 1), %rem in src is either 1 or 0, but %rem in tgt can be one of many integer values. This resolves https://bugs.llvm.org/show_bug.cgi?id=42619 . This miscompilation disappears if undef value is removed, but it may take a while. DivRemPair happens pretty late during the optimization pipeline, so this optimization seemed as a good candidate to fix without major regression using freeze than other broken optimizations. Reviewers: spatel, lebedev.ri, george.burgess.iv Reviewed By: spatel Subscribers: wuzish, regehr, nlopes, nemanjai, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76483	2020-03-25 03:46:14 +09:00
Matt Arsenault	b20a1d840f	GVNSink: Allow handling addrspacecast	2020-03-23 16:50:58 -04:00
Matt Arsenault	43d98a0ecf	Allow replacing intrinsic operands with variables Since intrinsics can now specify when an argument is required to be constant, it is now OK to replace arguments with variables if they aren't. This means intrinsics must now be accurately marked with immarg.	2020-03-23 15:51:57 -04:00
Florian Hahn	be86bc76f0	[Matrix] Generalize ColumnMatrixTy to MatrixTy (NFC). This patch sets the stage for supporting both row and column major layouts for matrixes. It renames ColumnMatrixTy to MatrixTy, adds booleans indicating the underlying layout to both MatrixTy and ShapeInfo and generalizes the methods of MatrixTy to support both row and column major layouts. Reviewers: Gerolf, anemet, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D76324	2020-03-20 08:32:13 +00:00
Florian Hahn	3a8372ed02	[DSE] Support traversing MemoryPhis. For MemoryPhis, we have to avoid that the MemoryPhi may be executed before before the access we are currently looking at. To do this we do a post-order numbering of the basic blocks in the function and bail out once we reach a MemoryPhi with a larger (or equal) post-order block number than the current MemoryAccess. This changes the order in which we visit stores for elimination. This patch also adds support for exploring multiple paths. We keep a worklist (ToCheck) of memory accesses that might be eliminated by our starting MemoryDef or MemoryPhis for further exploration. For MemoryPhis, we add the incoming values to the worklist, for MemoryDefs we add the defining access. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72148	2020-03-20 07:51:42 +00:00
Benjamin Kramer	1db8b341a6	[Matrix] Fold single-use variable into assert Avoids -Wunused-variable warnings in Release builds.	2020-03-19 21:42:22 +01:00
Florian Hahn	796fb2e474	[Matrix] Move multiply-add code generation into separate function (NFC). This logic can be shared with the tiled code generation. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75565	2020-03-19 20:26:19 +00:00
Kazu Hirata	e23d786526	[JumpThreading] Fix infinite loop (PR44611) Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44611 by preventing an infinite loop in the jump threading pass when -jump-threading-across-loop-headers is on. Specifically, without this patch, jump threading through two basic blocks would trigger on the same area of the CFG over and over, resulting in an infinite loop. Consider testcase PR44611-across-header-hang.ll in this patch. The first opportunity to thread through two basic blocks is: from bb_body2 through bb_header and bb_body1 to bb_body2. The pass duplicates bb_header and bb_body1 as, say, bb_header.thread1 and bb_body1.thread1. Since bb_header contains a successor edge back to itself, bb_header.thread1 also contains a successor edge to bb_header, immediately giving rise to the next jump threading opportunity: from bb_header.thread1 through bb_header and bb_body1 to bb_body2. After that, we repeatedly thread an incoming edge into bb_header through bb_header and bb_body1 to bb_body2. In other words, we keep peeling one iteration from bb_header's self loop. The patch fixes the problem by preventing the pass from duplicating a basic block containing a self loop. Reviewers: wmi, junparser, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76390	2020-03-19 12:49:36 -07:00
Florian Hahn	0cc2d23751	[Matrix] Hoist load/store generation logic, add helpers for tiled access. This patch slightly generalizes the code to emit loads and stores of a matrix and adds helpers to load/store a tile of a larger matrix. This will be used in a follow-up patch introducing initial tiling. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D75564	2020-03-19 19:28:21 +00:00
Florian Hahn	4a58996dd2	[SCCP] Use constant ranges for PHI nodes. For PHIs with multiple incoming values, we can improve precision by using constant ranges for integers. We can over-approximate phis by merging the incoming values. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71933	2020-03-19 12:45:33 +00:00
Florian Hahn	8a36594a7e	[SCCP] Use constant ranges for binary operators. If one of the operands of a binary operator is a constant range, we can use ConstantRange::binaryOp to approximate the result. We still handle single element constant ranges as we did previously, with ConstantExpr::get(), because ConstantRange::binaryOp still gives worse results in a few cases for single element ranges. Also note that we bail out early if any of the operands is still unknown. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71936	2020-03-19 09:35:48 +00:00
Florian Hahn	5672ae8d86	[SCCP] Use constant ranges for select, if cond is overdefined. For selects with an unknown condition, we can approximate the result by merging the state of both options. This automatically takes care of the case where on operand is undef. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71935	2020-03-18 09:26:02 +00:00
Michael Liao	f2f8bdc2b1	Fix `-Wunused-variable` warning. NFC.	2020-03-17 20:15:50 -04:00
Florian Hahn	a72ae99cf9	[SCCP] Split up callsite handling, only propagate result on change (NFC) Functions include their arguments in the use-list. Changed function values mean that the result of the function changed. We only need to update the call sites with the new function result and do not have to propagate the call arguments. To do so, this patch splits up the visitCallSite into handleCallResult and handleCallArguments and updates markUsersAsChanged to only update call results for functions. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75846	2020-03-17 20:05:35 +00:00
Florian Hahn	1d6f919df2	[SCCP] Explicitly mark values as overdefined (NFC). This was part of D60582 but can be committed separately.	2020-03-17 12:13:30 +00:00
Florian Hahn	4878aa36d4	[ValueLattice] Add new state for undef constants. This patch adds a new undef lattice state, which is used to represent UndefValue constants or instructions producing undef. The main difference to the unknown state is that merging undef values with constants (or single element constant ranges) produces the constant/constant range, assuming all uses of the merge result will be replaced by the found constant. Contrary, merging non-single element ranges with undef needs to go to overdefined. Using unknown for UndefValues currently causes mis-compiles in CVP/LVI (PR44949) and will become problematic once we use ValueLatticeElement for SCCP. Reviewers: efriedma, reames, davide, nikic Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D75120	2020-03-14 17:19:59 +00:00
Whitney Tsang	aca7167535	[NFC][LoopUnrollAndJam] clang-format. I am currently working on this file.	2020-03-14 00:04:10 +00:00
Alexey Zhikhartsev	f71abec661	[LoopInterchange] Fix interchanging contents of preheader BBs Summary: Previously LCSSA was getting broken by placing instructions into the (newly) inner header instead of the preheader. Fixes PR43474 Reviewers: fhahn Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75943	2020-03-13 15:59:37 -04:00
Florian Hahn	0c5b6e2ea5	Recommit "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This patch should fix the cause of the stage2 failures and PR45185. This reverts the revert commit `c52f839e72`.	2020-03-13 17:03:22 +00:00
Florian Hahn	c52f839e72	Revert "[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI)" This commit is likely causing clang-with-lto-ubuntu to fail http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/16052 Also causes PR45185. This reverts commit `f1ac5d2263`.	2020-03-12 18:49:11 +00:00
Florian Hahn	f1ac5d2263	[SCCP] Use ValueLatticeElement instead of LatticeVal (NFCI) This patch switches SCCP to use ValueLatticeElement for lattice values, instead of the local LatticeVal, as first step to enable integer range support. This patch does not make use of constant ranges for additional operations and the only difference for now is that integer constants are represented by single element ranges. To preserve the existing behavior, the following helpers are used * isConstant(LV): returns true when LV is either a constant or a constant range with a single element. This should return true in the same cases where LV.isConstant() returned true previously. * getConstant(LV): returns a constant if LV is either a constant or a constant range with a single element. This should return a constant in the same cases as LV.getConstant() previously. * getConstantInt(LV): same as getConstant, but additionally casted to ConstantInt. Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60582	2020-03-12 12:03:06 +00:00
Florian Hahn	bc6c8c4bbb	[Matrix] Add remark propagation along the inlined-at chain. This patch adds support for propagating matrix expressions along the inlined-at chain and emitting remarks at the traversed function scopes. To motivate this new behavior, consider the example below. Without the remark 'up-leveling', we would only get remarks in load.h and store.h, but we cannot generate a remark describing the full expression in toplevel.cpp, which is the place where the user has the best chance of spotting/fixing potential problems. With this patch, we generate a remark for the load in load.h, one for the store in store.h and one for the complete expression in toplevel.cpp. For a bigger example, please see remarks-inlining.ll. load.h: template <typename Ty, unsigned R, unsigned C> Matrix<Ty, R, C> load(Ty Ptr) { Matrix<Ty, R, C> Result; Result.value = reinterpret_cast <typename Matrix<Ty, R, C>::matrix_t >(Ptr); return Result; } store.h: template <typename Ty, unsigned R, unsigned C> void store(Matrix<Ty, R, C> M1, Ty Ptr) { reinterpret_cast<typename decltype(M1)::matrix_t >(Ptr) = M1.value; } toplevel.cpp void test(double A, double B, double *C) { store(add(load<double, 3, 5>(A), load<double, 3, 5>(B)), C); } For a given function, we traverse the inlined-at chain for each matrix instruction (= instructions with shape information). We collect the matrix instructions in each DISubprogram we visit. This produces a mapping of DISubprogram -> (List of matrix instructions visible in the subpogram). We then generate remarks using the list of instructions for each subprogram in the inlined-at chain. Note that the list of instructions for a subprogram includes the instructions from its own subprograms recursively. For example using the example above, for the subprogram 'test' this includes inline functions 'load' and 'store'. This allows surfacing the remarks at a level useful to users. Please note that the current approach may create a lot of extra remarks. Additional heuristics to cut-off the traversal can be implemented in the future. For example, it might make sense to stop 'up-leveling' once all matrix instructions are at the same debug location. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D73600	2020-03-11 17:40:08 +00:00
Benjamin Kramer	247a177cf7	Give helpers internal linkage. NFC.	2020-03-10 18:27:42 +01:00
Jonas Paulsson	c2dafe12dc	[SimplifyCFG] Skip merging return blocks if it would break a CallBr. SimplifyCFG should not merge empty return blocks and leave a CallBr behind with a duplicated destination since the verifier will then trigger an assert. This patch checks for this case and avoids the transformation. CodeGenPrepare has a similar check which also has a FIXME comment about why this is needed. It seems perhaps better if these two passes would eventually instead update the CallBr instruction instead of just checking and avoiding. This fixes https://bugs.llvm.org/show_bug.cgi?id=45062. Review: Craig Topper Differential Revision: https://reviews.llvm.org/D75620	2020-03-10 14:59:13 +01:00
Andrew Monshizadeh	c5a06019d2	Extend TimeTrace to LLVM's new pass manager With the addition of the LLD time tracing it made sense to include coverage for LLVM's various passes. Doing so ensures that ThinLTO is also covered with a time trace. Before: {F11333974} After: {F11333928} Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D74516	2020-03-06 14:45:19 -08:00
Anna Thomas	59029b9eef	[RS4GC] Handle uses of extractelement for conversion from vector to scalar base As mentioned in the comments, extractelement is special since we actually want a scalar base for that element we extracted from the vector (i.e. not a vector base). This same logic should apply to uses of the extractelement such as phis and selects which have the same BDV as the extractelement. Howeber, for these uses we conservatively mark the BDV state as conflict, since setting the EE's new base BDV does not always dominate these uses. Added testcase showcases the problem where the BDV identification chokes on the incorrect cast from vector to scalar for the phi use of extractelement. Tests-Run: make check, internal fuzzer testing Reviewers: reames, skatkov, dantrushin Reviewed-By: dantrushin Differential Revision: https://reviews.llvm.org/D75704	2020-03-06 16:28:49 -05:00
Jay Foad	11d1573bb6	[APFloat] Make use of new overloaded comparison operators. NFC. Reviewers: ekatz, spatel, jfb, tlively, craig.topper, RKSimon, nikic, scanon Subscribers: arsenm, jvesely, nhaehnle, hiraditya, dexonsmith, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75744	2020-03-06 16:42:53 +00:00
Zhongduo Lin	eae228a292	[IndVarSimplify] Extend previous special case for load use instruction to any narrow type loop variant to avoid extra trunc instruction Summary: The widenIVUse avoids generating trunc by evaluating the use as AddRec, this will not work when: 1) SCEV traces back to an instruction inside the loop that SCEV can not expand, eg. add %indvar, (load %addr) 2) SCEV finds a loop variant, eg. add %indvar, %loopvariant While SCEV fails to avoid trunc, we can still try to use instruction combining approach to prove trunc is not required. This can be further extended with other instruction combining checks, but for now we handle the following case (sub can be "add" and "mul", "nsw + sext" can be "nus + zext") ``` Src: %c = sub nsw %b, %indvar %d = sext %c to i64 Dst: %indvar.ext1 = sext %indvar to i64 %m = sext %b to i64 %d = sub nsw i64 %m, %indvar.ext1 ``` Therefore, as long as the result of add/sub/mul is extended to wide type with right extension and overflow wrap combination, no trunc is required regardless of how %b is generated. This pattern is common when calculating address in 64 bit architecture. Note that this patch reuse almost all the code from D49151 by @az: https://reviews.llvm.org/D49151 It extends it by providing proof of why trunc is unnecessary in more general case, it should also resolve some of the concerns from the following discussion with @reames. http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180910/585945.html Reviewers: sanjoy, efriedma, sebpop, reames, az, javed.absar, amehsan Reviewed By: az, amehsan Subscribers: hiraditya, llvm-commits, amehsan, reames, az Tags: #llvm Differential Revision: https://reviews.llvm.org/D73059	2020-03-05 16:27:59 -05:00
Sameer Sahasrabuddhe	42febbab91	StructurizeCFG: simplify phi nodes when possible After structurization, some phi nodes can have a single incoming edge and can be simplified away. This change runs a simplify query on all phis that are either modified or added by the structurizer. This also moves some phis closer to their use as a side benefit. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D75500	2020-03-05 10:33:15 +05:30
David Green	38e532278e	[LSR] Add masked load and store handling This teaches Loop Strength Reduction the details about masked load and store address operands, so that it can have a better time optimising them as it would for normal loads and stores. Differential Revision: https://reviews.llvm.org/D75371	2020-03-04 18:36:10 +00:00
Matt Arsenault	f9047ede58	LICM: Reorder condition checks Check the fast math flag before the more expensive loop check.	2020-03-03 17:15:57 -05:00
Juneyoung Lee	9f1f244d3c	[LICM] Allow freeze to hoist/sink out of a loop Summary: This patch allows LICM to hoist/sink freeze instructions out of a loop. Reviewers: reames, fhahn, efriedma Reviewed By: reames Subscribers: jfb, lebedev.ri, hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75400	2020-03-03 12:29:39 +09:00
Sumanth Gundapaneni	9897daa6bf	Update LSR's logic that identifies a post-increment SCEV value. One of the checks has been removed as it seem invalid. The LoopStep size is always almost a 32-bit. Differential Revision: https://reviews.llvm.org/D75079	2020-03-02 16:34:18 -06:00
Arkady Shlykov	3dcaf296ae	[Loop Peeling] Add possibility to enable peeling on loop nests. Summary: Current peeling implementation bails out in case of loop nests. The patch introduces a field in TargetTransformInfo structure that certain targets can use to relax the constraints if it's profitable (disabled by default). Also additional option is added to enable peeling manually for experimenting and testing purposes. Reviewers: fhahn, lebedev.ri, xbolva00 Reviewed By: xbolva00 Subscribers: RKSimon, xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D70304	2020-03-02 08:37:11 -08:00
Juneyoung Lee	5cbb265694	[GVN] Fold equivalent freeze instructions Summary: This patch defines two freeze instructions to have the same value number if they are equivalent. This is allowed because GVN replaces all uses of a duplicated instruction with another. If it partially rewrites use, it is not allowed. e.g) ``` a = freeze(x) b = freeze(x) use(a) use(a) use(b) => use(a) use(b) // This is not allowed! use(b) ``` Reviewers: fhahn, reames, spatel, efriedma Reviewed By: fhahn Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75398	2020-03-01 07:32:05 +09:00
Pierre-vh	2809abbd98	[Transform][MemCpyOpt] Add missing DebugLoc to %tmpbitcast Fix for https://bugs.llvm.org/show_bug.cgi?id=37967 Differential Revision: https://reviews.llvm.org/D75173	2020-02-28 15:20:51 +00:00
Juneyoung Lee	cc28a75467	Let EarlyCSE fold equivalent freeze instructions Summary: This patch makes EarlyCSE fold equivalent freeze instructions. Another optimization that I think will be useful is to remove freeze if its operand is used as a branch condition or at llvm.assume: ``` %c = ... br i1 %c, label %A, .. A: %d = freeze %c ; %d can be optimized to %c because %c cannot be poison or undef (or 'br %c' would be UB otherwise) ``` If it make sense for EarlyCSE to support this as well, I will make a patch for this. Reviewers: spatel, reames, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75334	2020-02-28 20:35:20 +09:00
Hans Wennborg	d48c981697	SROA: Don't drop atomic load/store alignments (PR45010) SROA will drop the explicit alignment on allocas when the ABI guarantees enough alignment. Because the alignment on new load/store instructions are set based on the alloca's alignment, that means SROA would end up dropping the alignment from atomic loads and stores, which is not allowed (see bug). For those, make sure to always carry over the alignment from the previous instruction. Differential revision: https://reviews.llvm.org/D75266	2020-02-28 10:38:40 +01:00
Juneyoung Lee	2b5a897651	Revert "[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison" .. due to performance regression. This patch is reverted until infrastructore for CSE/LICM support for freeze is added. This reverts commit `181628b`	2020-02-28 11:10:46 +09:00
Eli Friedman	b299926453	[IndVars] Fix sort comparator. std::sort will compare an element to itself in some cases. We should not crash if this happens. Differential Revision: https://reviews.llvm.org/D75000	2020-02-27 17:25:18 -08:00
Artur Pilipenko	02e3d5c3a2	Fix DSE miscompile when store is clobbered across loop iterations DSE would mistakenly remove store (2): a = calloc(n+1) for (int i = 0; i < n; i++) { store 1, a[i+1] // (1) store 0, a[i] // (2) } The fix is to do PHI transaltion while looking for clobbering instructions between the store and the calloc. Reviewed By: efriedma, bjope Differential Revision: https://reviews.llvm.org/D68006	2020-02-27 14:43:01 -08:00
Simon Moll	ddd11273d9	Remove BinaryOperator::CreateFNeg Use UnaryOperator::CreateFNeg instead. Summary: With the introduction of the native fneg instruction, the fsub -0.0, %x idiom is obsolete. This patch makes LLVM emit fneg instead of the idiom in all places. Reviewed By: cameron.mcinally Differential Revision: https://reviews.llvm.org/D75130	2020-02-27 09:06:03 -08:00
Nikita Popov	00f54050f7	[SimpleLoopUnswitch] Remove unnecessary include; NFC	2020-02-26 20:40:43 +01:00
Nikita Popov	9d9633fb70	[CVP] Simplify cmp of local phi node CVP currently does not simplify cmps with instructions in the same block, because LVI getPredicateAt() currently does not provide much useful information for that case (D69686 would change that, but is stuck.) However, if the instruction is a Phi node, then LVI can compute the result of the predicate by threading it into the predecessor blocks, which allows it simplify some conditions that nothing else can handle. Relevant code: `6d6a4590c5/llvm/lib/Analysis/LazyValueInfo.cpp (L1904-L1927)` Differential Revision: https://reviews.llvm.org/D72169	2020-02-26 20:36:41 +01:00
Juneyoung Lee	1cb7ec870d	[SimpleLoopUnswitch] Canonicalize variable names	2020-02-26 15:33:02 +09:00
Juneyoung Lee	181628b52d	[SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison Summary: Loop unswitch hoists branches on loop-invariant conditions. However, if this condition is poison/undef and the branch wasn't originally reachable, loop unswitch introduces UB (since the optimized code will branch on poison/undef and the original one didn't)). We fix this problem by freezing the condition to ensure we don't introduce UB. We will now transform the following: while (...) { if (C) { A } else { B } } Into: C' = freeze(C) if (C') { while (...) { A } } else { while (...) { B } } This patch fixes the root cause of the following bug reports (which use the old loop unswitch, but can be reproduced with minor changes in the code and -enable-nontrivial-unswitch): - https://llvm.org/bugs/show_bug.cgi?id=27506 - https://llvm.org/bugs/show_bug.cgi?id=31652 Reviewers: reames, majnemer, chenli, sanjoy, hfinkel Reviewed By: reames Subscribers: hiraditya, jvesely, nhaehnle, filcab, regehr, trentxintong, nlopes, llvm-commits, mzolotukhin Tags: #llvm Differential Revision: https://reviews.llvm.org/D29015	2020-02-26 13:47:33 +09:00
Roman Lebedev	400ceda425	[SCEV][IndVars] Always provide insertion point to the SCEVExpander::isHighCostExpansion() Summary: This addresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` regression from D73728 Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73777	2020-02-25 23:05:59 +03:00
Roman Lebedev	b99c91a087	[NFC][SCEV] Piping to pass new SCEVCheapExpansionBudget option into SCEVExpander::isHighCostExpansionHelper() Summary: In future patches`SCEVExpander::isHighCostExpansionHelper()` will respect the budget allocated by performing TTI cost modelling. This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73705	2020-02-25 23:05:57 +03:00
Roman Lebedev	0789f28048	[NFC][SCEV] Piping to pass TTI into SCEVExpander::isHighCostExpansionHelper() Summary: Future patches will make use of TTI to perform cost-model-driven `SCEVExpander::isHighCostExpansionHelper()` This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73704	2020-02-25 23:05:56 +03:00
Philip Reames	14845b2c45	Revert "[LICM] Support hosting of dynamic allocas out of loops" This reverts commit `8d22100f66`. There was a functional regression reported (https://bugs.llvm.org/show_bug.cgi?id=44996). I'm not actually sure the patch is wrong, but I don't have time to investigate currently, and this line of work isn't something I'm likely to get back to quickly.	2020-02-25 09:05:31 -08:00
Florian Hahn	b8d638d337	[DSE,MSSA] Do not attempt to remove un-removable memdefs. We have to skip MemoryDefs that cannot be removed. This fixes a crash in the newly added test case and fixes a wrong case in memset-and-memcpy.ll.	2020-02-25 13:31:46 +00:00
Florian Hahn	af69d5e10e	[DSE] Track overlapping stores. Add a map from BasicBlocks to overlap intervals. For partial writes, we can keep track of those in IOLs. We only add candidates that are valid for eliminations. Reviewers: dmgreen, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D73757	2020-02-23 15:44:40 +00:00
Florian Hahn	134bab7cd5	[DSE,MSSA] Add debug counter. Can be used like -debug-counter=dse-memoryssa-skip=10,dse-memoryssa-counter-count=20 Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72147	2020-02-21 17:04:37 +00:00
Bill Wendling	2fe457690d	Filter callbr insts from critical edge splitting Similarly to how splitting predecessors with an indirectbr isn't handled in the generic way, we also shouldn't split callbrs, for similar reasons.	2020-02-20 16:24:42 -08:00
Florian Hahn	99809f98d7	[SCCP] Do not mark unknown loads as overdefined. For tracked globals that are unknown after solving, we expect all non-store uses to be replaced. This is a follow-up to `f8045b250d`, which removed forcedconstant. We should not mark unknown loads as overdefined, as they either load from an unknown pointer or an undef global. Restore the original logic for loads.	2020-02-20 22:48:58 +01:00
dfukalov	dbfc682e2b	SpeculativeExecution: fixed ingoring free execution Summary: After updating cost model in AMDGPU target (`47a5c36b37`) the pass started to ignore some BBs since they got all instructions estimated as free. Reviewers: arsenm, chandlerc, nhaehnle Reviewed By: nhaehnle Subscribers: jvesely, wdng, nhaehnle, tpr, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74825	2020-02-20 14:45:02 +03:00
Michael Kruse	e4d20ec8ad	[IndVarSimply] Fix assert/release build difference. In builds with assertions enabled (!NDEBUG), IndVarSimplify does an additional query to ScalarEvolution which may change future SCEV queries since it fills the internal cache differently. The result is actually only used with the -verify-indvars command line option. We fix the issue by only calling SE->getBackedgeTakenCount(L) if -verify-indvars is enabled such that only -verify-indvars shows the behavior, but not debug builds themselves. Also add a remark to the description of -verify-indvars about this behavior. Fixes llvm.org/PR44815 Differential Revision: https://reviews.llvm.org/D74810	2020-02-19 14:36:22 -06:00
Reid Kleckner	0c2b09a9b6	[IR] Lazily number instructions for local dominance queries Essentially, fold OrderedBasicBlock into BasicBlock, and make it auto-invalidate the instruction ordering when new instructions are added. Notably, we don't need to invalidate it when removing instructions, which is helpful when a pass mostly delete dead instructions rather than transforming them. The downside is that Instruction grows from 56 bytes to 64 bytes. The resulting LLVM code is substantially simpler and automatically handles invalidation, which makes me think that this is the right speed and size tradeoff. The important change is in SymbolTableTraitsImpl.h, where the numbering is invalidated. Everything else should be straightforward. We probably want to implement a fancier re-numbering scheme so that local updates don't invalidate the ordering, but I plan for that to be future work, maybe for someone else. Reviewed By: lattner, vsk, fhahn, dexonsmith Differential Revision: https://reviews.llvm.org/D51664	2020-02-18 14:44:24 -08:00
Fangrui Song	13a97305ba	[JumpThreading] Skip unconditional PredBB when threading jumps through two basic blocks Fixes https://bugs.llvm.org/show_bug.cgi?id=44922 (caused by `4698bf145d`) ThreadThroughTwoBasicBlocks assumes PredBBBranch is conditional. The following code can segfault. AddPHINodeEntriesForMappedBlock(PredBBBranch->getSuccessor(1), PredBB, NewBB, ValueMapping); We can also allow unconditional PredBB, but the produced code is not better. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D74747	2020-02-18 11:01:46 -08:00
Nicolai Hähnle	58297e4d8f	LowerMatrixIntrinsics: Avoid use of deprecated CreateCall methods Reviewers: t.p.northover Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74675	2020-02-18 00:24:09 +01:00
Nikita Popov	3eaa53e805	Reapply "[IRBuilder] Virtualize IRBuilder" Relative to the original commit, this fixes some warnings, and is based on the deletion of the IRBuilder copy constructor in D74693. The automatic copy constructor would no longer be safe. ----- Related llvm-dev thread: http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html This patch moves the IRBuilder from templating over the constant folder and inserter towards making both of these virtual. There are a couple of motivations for this: 1. It's not possible to share code between use-sites that use different IRBuilder folders/inserters (short of templating the code and moving it into headers). 2. Methods currently defined on IRBuilderBase (which is not templated) do not use the custom inserter, resulting in subtle bugs (e.g. incorrect InstCombine worklist management). It would be possible to move those into the templated IRBuilder, but... 3. The vast majority of the IRBuilder implementation has to live in the header, because it depends on the template arguments. 4. We have many unnecessary dependencies on IRBuilder.h, because it is not easy to forward-declare. (Significant parts of the backend depend on it via TargetLowering.h, for example.) This patch addresses the issue by making the following changes: * IRBuilderDefaultInserter::InsertHelper becomes virtual. IRBuilderBase accepts a reference to it. * IRBuilderFolder is introduced as a virtual base class. It is implemented by ConstantFolder (default), NoFolder and TargetFolder. IRBuilderBase has a reference to this as well. * All the logic is moved from IRBuilder to IRBuilderBase. This means that methods can in the future replace their IRBuilder<> & uses (or other specific IRBuilder types) with IRBuilderBase & and thus be usable with different IRBuilders. * The IRBuilder class is now a thin wrapper around IRBuilderBase. Essentially it only stores the folder and inserter and takes care of constructing the base builder. What this patch doesn't do, but should be simple followups after this change: * Fixing use of the inserter for creation methods originally defined on IRBuilderBase. * Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful. * Moving code from the IRBuilder header to the source file. From the user perspective, these changes should be mostly transparent: The only thing that consumers using a custom inserted may need to do is inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper as public. Differential Revision: https://reviews.llvm.org/D73835	2020-02-17 19:04:11 +01:00
Nikita Popov	5f7b92b1b4	[IRBuilder] Prefer InsertPointGuard over full copy; NFC Don't copy the IRBuilder when an InsertPointGuard would also do.	2020-02-16 18:02:29 +01:00
Nikita Popov	7c362b25d7	[IRBuilder] Fix unnecessary IRBuilder copies; NFC Fix a few cases where an IRBuilder is passed to a helper function by value, while a by reference pass was intended.	2020-02-16 17:57:18 +01:00
Nikita Popov	af480e8c63	Revert "[IRBuilder] Virtualize IRBuilder" This reverts commit `0765d3824d`. This reverts commit `1b04866a3d`. Relevant looking crashes observed on: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win	2020-02-16 17:01:10 +01:00
Nikita Popov	1b04866a3d	[IRBuilder] Try to fix warnings Try to fix -Wnon-virtual-dtor warnings that cause build failure on clang-pcc64le-rhel.	2020-02-16 15:32:11 +01:00
Nikita Popov	0765d3824d	[IRBuilder] Virtualize IRBuilder Related llvm-dev thread: http://lists.llvm.org/pipermail/llvm-dev/2020-February/138951.html This patch moves the IRBuilder from templating over the constant folder and inserter towards making both of these virtual. There are a couple of motivations for this: 1. It's not possible to share code between use-sites that use different IRBuilder folders/inserters (short of templating the code and moving it into headers). 2. Methods currently defined on IRBuilderBase (which is not templated) do not use the custom inserter, resulting in subtle bugs (e.g. incorrect InstCombine worklist management). It would be possible to move those into the templated IRBuilder, but... 3. The vast majority of the IRBuilder implementation has to live in the header, because it depends on the template arguments. 4. We have many unnecessary dependencies on IRBuilder.h, because it is not easy to forward-declare. (Significant parts of the backend depend on it via TargetLowering.h, for example.) This patch addresses the issue by making the following changes: * IRBuilderDefaultInserter::InsertHelper becomes virtual. IRBuilderBase accepts a reference to it. * IRBuilderFolder is introduced as a virtual base class. It is implemented by ConstantFolder (default), NoFolder and TargetFolder. IRBuilderBase has a reference to this as well. * All the logic is moved from IRBuilder to IRBuilderBase. This means that methods can in the future replace their IRBuilder<> & uses (or other specific IRBuilder types) with IRBuilderBase & and thus be usable with different IRBuilders. * The IRBuilder class is now a thin wrapper around IRBuilderBase. Essentially it only stores the folder and inserter and takes care of constructing the base builder. What this patch doesn't do, but should be simple followups after this change: * Fixing use of the inserter for creation methods originally defined on IRBuilderBase. * Replacing IRBuilder<> uses in arguments with IRBuilderBase, where useful. * Moving code from the IRBuilder header to the source file. From the user perspective, these changes should be mostly transparent: The only thing that consumers using a custom inserted may need to do is inherit from IRBuilderDefaultInserter publicly and mark their InsertHelper as public. Differential Revision: https://reviews.llvm.org/D73835	2020-02-16 13:48:55 +01:00
Florian Hahn	f8045b250d	Recommit "[SCCP] Remove forcedconstant, go to overdefined instead" This includes a fix for cases where things get marked as overdefined in ResolvedUndefsIn, but we later discover a constant. To avoid crashing, we consistently bail out on overdefined values in the visitors. This is similar to the previous behavior with forcedconstant. This reverts the revert commit `02b72f564c`.	2020-02-15 18:36:44 +01:00
Alina Sbirlea	1326a5a4cf	[LoopRotate] Get and update MSSA only if available in legacy pass manager. Summary: Potential fix for: https://bugs.llvm.org/show_bug.cgi?id=44889 and https://bugs.llvm.org/show_bug.cgi?id=44408 In the legacy pass manager, loop rotate need not compute MemorySSA when not being in the same loop pass manager with other loop passes. There isn't currently a way to differentiate between the two cases, so this attempts to limit the usage in LoopRotate to only update MemorySSA when the analysis is already available. The side-effect of this is that it will split the Loop pipeline. This issue does not apply to the new pass manager, where we have a flag specifying if all loop passes in that loop pass manager preserve MemorySSA. Reviewers: dmgreen, fedor.sergeev, nikic Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74574	2020-02-14 10:47:26 -08:00
Vedant Kumar	02b72f564c	Revert "Recommit "[SCCP] Remove forcedconstant, go to overdefined instead"" This reverts commit `bb310b3f73`. This breaks the stage2 ASan build, see: https://bugs.llvm.org/show_bug.cgi?id=44898 rdar://59431448	2020-02-13 11:55:18 -08:00
Florian Hahn	bb310b3f73	Recommit "[SCCP] Remove forcedconstant, go to overdefined instead" This version includes a fix for a set of crashes caused by marking values depending on a yet unknown & tracked call as overdefined. In some cases, we would later discover that the call has a constant result and try to mark a user of it as constant, although it was already marked as overdefined. Most instruction handlers bail out early if the instruction is already overdefined. But that is not necessary for CastInsts for example. By skipping values that depend on skipped calls, we resolve the crashes and also improve the precision in some cases (see resolvedundefsin-tracked-fn.ll). Note that we may not skip PHI nodes that may depend on a skipped call, but they can be safely marked as overdefined, as we bail out early if the PHI node is overdefined. This reverts the revert commit a74b31a3e9cd844c7ce2087978568e3f5ec8519.	2020-02-12 18:02:18 +00:00
Anh Tuyen Tran	a5b6480d05	[NFC] Remove extra headers included in Loop Unroll and LoopUnrollAndJam files Summary: This refactor patch removes some header files which are not needed and also add some to meet IWYU principles. Reviewers: rnk (Reid Kleckner), Meinersbur (Michael Kruse), dmgreen (Dave Green) Reviewed By: dmgreen (Dave Green), rnk (Reid Kleckner), Meinersbur (Michael Kruse) Subscribers: dmgreen (Dave Green), Whitney (Whitney Tsang), hiraditya (Aditya Kumar), zzheng (Z. Zheng), llvm-commits, LLVM Tag: LLVM Differential Revision: https://reviews.llvm.org/D73498	2020-02-12 17:57:56 +00:00
Alina Sbirlea	4f33a68973	Compute ORE, BPI, BFI in Loop passes. Summary: Passes ORE, BPI, BFI are not being preserved by Loop passes, hence it is incorrect to retrieve these passes as cached. This patch makes the loop passes in question compute a new instance. In some of these cases, however, it may be beneficial to change the Loop pass to a Function pass instead, similar to the change for LoopUnrollAndJam. Reviewers: chandlerc, dmgreen, jdoerfert, reames Subscribers: mehdi_amini, hiraditya, zzheng, steven_wu, dexonsmith, Whitney, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72891	2020-02-12 09:15:18 -08:00
Florian Hahn	81dbb6aec6	Recommit "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This includes a fix for the santizier failures. This reverts the revert commit `42f8b915eb`.	2020-02-12 14:17:50 +00:00
stozer	ffeb64db35	Reapply "[DebugInfo] Prevent explosion of debug intrinsics during jump threading" This reverts commit `6ded69f294`.	2020-02-12 12:39:54 +00:00
stozer	6ded69f294	Revert "[DebugInfo] Prevent explosion of debug intrinsics during jump threading" This reverts commit `fe6f6cd6b8`. Found test failure on several buildbots.	2020-02-12 11:48:00 +00:00
stozer	fe6f6cd6b8	[DebugInfo] Prevent explosion of debug intrinsics during jump threading This patch is a fix following the revert of `72ce759` (https://reviews.llvm.org/rG72ce759928e6dfee6a9efa310b966c19722352ba) and fixes the failure that it caused. The above patch failed on the Thread Sanitizer buildbot with an out of memory error. After an investigation, the cause was identified as an explosion in debug intrinsics while running the Jump Threading pass on ModuleMap.ll. The above patched prevented debug intrinsics from being dropped when their Basic Block was deleted due to being "empty". In this case, one of the functions in ModuleMap.ll had (after many optimization passes) a very large number of debug intrinsics representing a set of repeatedly inlined variables. Previously the vast majority of these were silently dropped during Jump Threading when their blocks were deleted, but as of the above patch they survived for longer, causing a large increase in the number of debug intrinsics. These intrinsics were then repeatedly cloned by the Jump Threading pass as edges were threaded, multiplying the intrinsic count further. The memory consumed by this process spiralled out of control, crashing the buildbot that uses TSan (which has an estimated 5-10x memory overhead compared to non-sanitized builds). This patch adds RemoveRedundantDbgInstrs to the Jump Threading pass, in order to reduce the number of debug intrinsics down to a manageable amount in cases where many intrinsics for the same variable end up bunched together contiguously, as in this case. Differential Revision: https://reviews.llvm.org/D73054	2020-02-12 11:22:54 +00:00
Florian Hahn	fa74b31a3e	Revert "[SCCP] Remove forcedconstant, go to overdefined instead" This causes a crash for the reproducer below enum { a }; enum b { c, d }; e; static _Bool g(struct f *h, enum b i) { i &&j(); return a; } static k(char h, enum b i) { _Bool l = g(e, i); l; } m(h) { k(h, c); g(h, d); } This reverts commit `aadb635e04`.	2020-02-12 09:41:19 +00:00
Florian Hahn	aadb635e04	[SCCP] Remove forcedconstant, go to overdefined instead This patch removes forcedconstant to simplify things for the move to ValueLattice, which includes constant ranges, but no forced constants. This patch removes forcedconstant and changes ResolvedUndefsIn to mark instructions with unknown operands as overdefined. This means we do not do simplifications based on undef directly in SCCP any longer, but this seems to hardly come up in practice (see stats below), presumably because InstCombine & others take care of most of the relevant folds already. It is still beneficial to keep ResolvedUndefIn, as it allows us delaying going to overdefined until we propagated all known information. I also built MultiSource, SPEC2000 and SPEC2006 and compared sccp.IPNumInstRemoved and sccp.NumInstRemoved. It looks like the impact is quite low: Tests: 244 Same hash: 238 (filtered out) Remaining: 6 Metric: sccp.IPNumInstRemoved Program base patch diff test-suite...arks/VersaBench/dbms/dbms.test 4.00 3.00 -25.0% test-suite...TimberWolfMC/timberwolfmc.test 38.00 34.00 -10.5% test-suite...006/453.povray/453.povray.test 158.00 155.00 -1.9% test-suite.../CINT2000/176.gcc/176.gcc.test 668.00 668.00 0.0% test-suite.../CINT2006/403.gcc/403.gcc.test 1209.00 1209.00 0.0% test-suite...arks/mafft/pairlocalalign.test 76.00 76.00 0.0% Tests: 244 Same hash: 238 (filtered out) Remaining: 6 Metric: sccp.NumInstRemoved Program base patch diff test-suite...arks/mafft/pairlocalalign.test 185.00 175.00 -5.4% test-suite.../CINT2006/403.gcc/403.gcc.test 2059.00 2056.00 -0.1% test-suite.../CINT2000/176.gcc/176.gcc.test 2358.00 2357.00 -0.0% test-suite...006/453.povray/453.povray.test 317.00 317.00 0.0% test-suite...TimberWolfMC/timberwolfmc.test 12.00 12.00 0.0% Reviewers: davide, efriedma, mssimpso Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61314	2020-02-11 15:24:15 +00:00
Kadir Cetinkaya	42f8b915eb	Revert "[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk)." This reverts commit `d0c4d4fe09`. Revert "[DSE,MSSA] Move more passing test cases from todo to simple.ll." This reverts commit `02266e64bb`. Revert "[DSE,MSSA] Adjust mda-with-dbg-values.ll to MSSA backed DSE." This reverts commit `74f03e4ff0`.	2020-02-11 15:34:48 +01:00
Sanjay Patel	b8ebc11f03	[EarlyCSE] avoid crashing when detecting min/max/abs patterns (PR41083) As discussed in PR41083: https://bugs.llvm.org/show_bug.cgi?id=41083 ...we can assert/crash in EarlyCSE using the current hashing scheme and instructions with flags. ValueTracking's matchSelectPattern() may rely on overflow (nsw, etc) or other flags when detecting patterns such as min/max/abs composed of compare+select. But the value numbering / hashing mechanism used by EarlyCSE intersects those flags to allow more CSE. Several alternatives to solve this are discussed in the bug report. This patch avoids the issue by doing simple matching of min/max/abs patterns that never requires instruction flags. We give up some CSE power because of that, but that is not expected to result in much actual performance difference because InstCombine will canonicalize these patterns when possible. It even has this comment for abs/nabs: /// Canonicalize all these variants to 1 pattern. /// This makes CSE more likely. (And this patch adds PhaseOrdering tests to verify that the expected transforms are still happening in the standard optimization pipelines. I left this code to use ValueTracking's "flavor" enum values, so we don't have to change the callers' code. If we decide to go back to using the ValueTracking call (by changing the hashing algorithm instead), it should be obvious how to replace this chunk. Differential Revision: https://reviews.llvm.org/D74285	2020-02-10 17:25:34 -05:00
Florian Hahn	d0c4d4fe09	[DSE] Add first version of MemorySSA-backed DSE (Bottom up walk). This patch adds a first version of a MemorySSA based DSE. It is missing a lot of features, which will get added as follow-ups, to help to keep the review manageable. The patch uses the following general approach: given a MemoryDef, walk upwards to find clobbering MemoryDefs that may be killed by the starting def. Then check that there are no uses that may read the location of the original MemoryDef in between both MemoryDefs. A bit more concretely: For all MemoryDefs StartDef: 1. Get the next dominating clobbering MemoryDef (DomAccess) by walking upwards. 2. Check that there no reads between DomAccess and the StartDef by checking all uses starting at DomAccess and walking until we see StartDef. 3. For each found DomDef, check that: 1. There are no barrier instructions between DomDef and StartDef (like throws or stores with ordering constraints). 2. StartDef is executed whenever DomDef is executed. 3. StartDef completely overwrites DomDef. 4. Erase DomDef from the function and MemorySSA. The patch uses a very simple approach to guarantee that no throwing instructions are between 2 stores: We only allow accesses to stack objects, access that are in the same basic block if the block does not contain any throwing instructions or accesses in functions that do not contain any throwing instructions. This will get lifted later. Besides adding support for the missing cases, there is plenty of additional potential for improvements as follow-up work, e.g. the way we visit stores (could be just a traversal of the MemorySSA, rather than collecting them up-front), using the alias information discovered during walking to optimize the MemorySSA. This is loosely based on D40480 by Dave Green. Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D72700	2020-02-10 11:52:11 +00:00
Florian Hahn	da52b9c118	[DSE] Add tests for MemorySSA based DSE. This copies the DSE tests into a MSSA subdirectory to test the MemorySSA backed DSE implementation, without disturbing the original tests. Differential Revision: https://reviews.llvm.org/D72145	2020-02-10 10:28:43 +00:00
Denis Antrushin	99a6e405ed	[IRCE] Use SCEVExpander to modify loop bound IRCE pass checks that it can calculate loop bounds by checking SCEV availability at loop entry. However it is possible that loop bound SCEV is loop invariant, but instruction used to compute it resides within loop. In such case adjusting loop bound in preheader using IRBuilder leads to malformed SSA. Use SCEVExpander instead to generate proper instructions. Reviewed-by: mkazantsev Differential Revision: https://reviews.llvm.org/D73496	2020-02-06 12:44:43 +03:00
Juneyoung Lee	5687acf431	[MemCpyOpt] Simplify find*Alignment	2020-02-06 06:42:07 +09:00
Juneyoung Lee	ad9ae6ee2b	MemCpyOpt cannot use ABI alignment even if it was not given Summary: This patch fixes https://bugs.llvm.org/show_bug.cgi?id=44388 which incorrectly assigns an ABI alignment to memset when there was no explicit alignment given. Reviewers: gchatelet, lenary, nikic Reviewed By: nikic Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74083	2020-02-06 06:21:55 +09:00
Kazu Hirata	4698bf145d	Resubmit^2: [JumpThreading] Thread jumps through two basic blocks This reverts commit `41784bed01`. Since the original revision `ead815924e`, this revision fixes three issues: - This revision fixes the Windows build. My original patch improperly copied EH pads on Windows. This patch disregards jump threading opportunities having to do with EH pads. - This revision fixes jump threading to a wrong destination. Specifically, my original patch treated any Constant other than 0 as 1 while evaluating the branch condition. This bug led to treating constant expressions like: icmp ugt i8* null, inttoptr (i64 4 to i8) to "true". This patch fixes the bug by calling isOneValue. - This revision fixes the cost calculation of two basic blocks being threaded through. Note that getJumpThreadDuplicationCost returns "(unsigned)~0" for those basic blocks that cannot be duplicated. If we sum of two return values from getJumpThreadDuplicationCost, we could have an unsigned overflow like: (unsigned)~0 + 5 = 4 and mistakenly determine that it's safe and profitable to proceed with the jump threading opportunity. The patch fixes the bug by checking each return value before summing them up. [JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32 [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-02-05 09:23:37 -08:00
Alina Sbirlea	67904db23c	[IRCE] Make IRCE a Function pass. Summary: Make InductiveRangeCheckElimination a FunctionPass. Reviewers: reames, mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73592	2020-02-05 09:22:41 -08:00
Alina Sbirlea	1c03cc5a39	[NFCI] Update according to style. clang-tidy + clang-format	2020-02-04 17:11:36 -08:00
Thomas Raoux	e53bbf1213	[GVN] Add GVNOption to control load-pre more fine-grained. Adds the global (cl::opt) GVNOption enable-load-in-loop-pre in order to control whether the optimization will be performed if the load is part of a loop. Patch by Hendrik Greving! Differential Revision: https://reviews.llvm.org/D73804	2020-02-03 23:00:58 -08:00
Alina Sbirlea	388de9dfcd	[LoopUtils] Make duplicate method a utility. [NFCI] Summary: Method appendLoopsToWorklist is duplicate in LoopUnroll and in the LoopPassManager as an internal method. Make it an utility. Reviewers: dmgreen, chandlerc, fedor.sergeev, yamauchi Subscribers: mehdi_amini, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73569	2020-02-03 10:24:18 -08:00
Sam Parker	2663a25fad	[JumpThreading] Half the duplicate threshold at Oz Duplicating instructions can lead to code size increases but using a threshold of 3 is good for reducing code size. Differential Revision: https://reviews.llvm.org/D72916	2020-02-03 08:40:20 +00:00
Fangrui Song	ba3a1774a9	[Transforms] Simplify with make_early_inc_range	2020-02-02 00:54:32 -08:00
Artur Pilipenko	34547ac959	NFC. Comments cleanup in DSE::memoryIsNotModifiedBetween Separated from https://reviews.llvm.org/D68006 review.	2020-01-31 15:22:33 -08:00
Whitney Tsang	e44f4a8a54	[LoopFusion] Move instructions from FC1.GuardBlock to FC0.GuardBlock and from FC0.ExitBlock to FC1.ExitBlock when proven safe. Summary: Currently LoopFusion give up when the second loop nest guard block or the first loop nest exit block is not empty. For example: if (0 < N) { for (int i = 0; i < N; ++i) {} x+=1; } y+=1; if (0 < N) { for (int i = 0; i < N; ++i) {} } The above example should be safe to fuse. This PR moves instructions in FC1 guard block (e.g. y+=1;) to FC0 guard block, or instructions in FC0 exit block (e.g. x+=1;) to FC1 exit block, which then LoopFusion is able to fuse them. Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D73641	2020-01-30 18:02:22 +00:00
Whitney Tsang	da58e68fdf	[LoopFusion] Move instructions from FC1.Preheader to FC0.Preheader when proven safe. Summary: Currently LoopFusion give up when the second loop nest preheader is not empty. For example: for (int i = 0; i < 100; ++i) {} x+=1; for (int i = 0; i < 100; ++i) {} The above example should be safe to fuse. This PR moves instructions in FC1 preheader (e.g. x+=1; ) to FC0 preheader, which then LoopFusion is able to fuse them. Reviewer: kbarton, Meinersbur, jdoerfert, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71821	2020-01-29 15:06:11 +00:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Whitney Tsang	cd0cff4392	[NFCI][LoopUnrollAndJam] Minor changes. Summary: 1. Add assertions. 2. Verify more analyses. These changes are moved out of https://reviews.llvm.org/D73129 to simplify that review. Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: dmgreen Subscribers: fhahn, hiraditya, zzheng, llvm-commits, prithayan, anhtuyen Tag: LLVM Differential Revision: https://reviews.llvm.org/D73204	2020-01-28 20:24:23 +00:00
Florian Hahn	5d0ffbeb4d	[Matrix] Mark expressions shared between multiple remarks. This patch adds support for explicitly highlighting sub-expressions shared by multiple leaf nodes. For example consider the following code %shared.load = tail call <8 x double> @llvm.matrix.columnwise.load.v8f64.p0f64(double* %arg1, i32 %stride, i32 2, i32 4), !dbg !10, !noalias !10 %trans = tail call <8 x double> @llvm.matrix.transpose.v8f64(<8 x double> %shared.load, i32 2, i32 4), !dbg !10 tail call void @llvm.matrix.columnwise.store.v8f64.p0f64(<8 x double> %trans, double* %arg3, i32 10, i32 4, i32 2), !dbg !10 %load.2 = tail call <30 x double> @llvm.matrix.columnwise.load.v30f64.p0f64(double* %arg3, i32 %stride, i32 2, i32 15), !dbg !10, !noalias !10 %mult = tail call <60 x double> @llvm.matrix.multiply.v60f64.v8f64.v30f64(<8 x double> %trans, <30 x double> %load.2, i32 4, i32 2, i32 15), !dbg !11 tail call void @llvm.matrix.columnwise.store.v60f64.p0f64(<60 x double> %mult, double* %arg2, i32 10, i32 4, i32 15), !dbg !11 We have two leaf nodes (the 2 stores) and the first store stores %trans which is also used by the matrix multiply %mult. We generate separate remarks for each leaf (stores). To denote that parts are shared, the shared expressions are marked as shared (), with a reference to the other remark that shares it. The operation summary also denotes the shared operations separately. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D72526	2020-01-28 09:27:55 -08:00
Whitney Tsang	78dc64989c	[CodeMoverUtils] Improve IsControlFlowEquivalent. Summary: Currently IsControlFlowEquivalent determine if two blocks are control flow equivalent by checking if A dominates B and B post dominates A. There exists blocks that are control flow equivalent even if they don't satisfy the A dominates B and B post dominates A condition. For example, if (cond) A if (cond) B In the PR, we determine if two blocks are control flow equivalent by also checking if the two sets of conditions A and B depends on are equivalent. Reviewer: jdoerfert, Meinersbur, dmgreen, etiotto, bmahjour, fhahn, hfinkel, kbarton Reviewed By: fhahn Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71578	2020-01-28 14:18:00 +00:00
Florian Hahn	62e228f8fd	[Matrix] Add info about number of operations to remarks. This patch updates the remark to also include a summary of the number of vector operations generated for each matrix expression. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D72480	2020-01-27 17:43:39 -08:00
Florian Hahn	949294f396	[Matrix] Add optimization remarks for matrix expression. Generate remarks for matrix operations in a function. To generate remarks for matrix expressions, the following approach is used: 1. Collect leafs of matrix expressions (done in RemarkGenerator::getExpressionLeafs). Leafs are lowered matrix instructions without other matrix users (like stores). 2. For each leaf, create a remark containing a linearizied version of the matrix expression. The following improvements will be submitted as follow-ups: * Summarize number of vector instructions generated for each expression. * Account for shared sub-expressions. * Propagate matrix remarks up the inlining chain. The information provided by the matrix remarks helps users to spot cases where matrix expression got split up, e.g. due to inlining not happening. The remarks allow users to address those issues, ensuring best performance. Reviewers: anemet, Gerolf, thegameg, hfinkel, andrew.w.kaylor, LuoYuanke Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D72453	2020-01-27 16:39:29 -08:00
Guillaume Chatelet	07c9d53266	[Alignment][NFC] Use Align with CreateAlignedLoad Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73449	2020-01-27 10:58:36 +01:00
Alina Sbirlea	0d90d2457c	[LoopStrengthReduce] Teach LoopStrengthReduce to preserve MemorySSA is available.	2020-01-24 10:13:52 -08:00
Alina Sbirlea	1d09174290	[LoopStrengthReduce] Reuse utility method to clean dead instructions. [NFCI] Create a utility wrapper for the RecursivelyDeleteTriviallyDeadInstructions utility method, which sets to nullptr the instructions that are not trivially dead. Use the new method in LoopStrengthReduce. Alternative: add a bool to the same method; this option adds a marginal amount of overhead to the other callers, and the method needs to be updated to return a bool status when it removes/doesn't remove instructions.	2020-01-23 16:27:32 -08:00
Alina Sbirlea	9e66c4ec12	[Utils] Use WeakTrackingVH in vector used as scratch storage. The utility method RecursivelyDeleteTriviallyDeadInstructions receives as input a vector of Instructions, where all inputs are valid instructions. This same vector is used as a scratch storage (per the header comment) to recursively delete instructions. If an instruction is added as an operand of multiple other instructions, it may be added twice, then deleted once, then the second reference in the vector is invalid. Switch to using a Vector<WeakTrackingVH>. This change facilitates a clean-up in LoopStrengthReduction.	2020-01-23 16:04:57 -08:00
Florian Hahn	4ed7355e44	[IPSCCP] Use ParamState for arguments at call sites. We currently use integer ranges to merge concrete function arguments. We use the ParamState range for those, but we only look up concrete values in the regular state. For concrete function arguments that are themselves arguments of the containing function, we can use the param state directly and improve the precision in some cases. Besides improving the results in some cases, this is also a small step towards switching to ValueLatticeElement, by allowing D60582 to be a NFC. Reviewers: efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71836	2020-01-23 13:55:42 -08:00
Alina Sbirlea	6770de9b8d	[LoopIdiomRecognize] Teach LoopIdiomRecognize to preserve MemorySSA.	2020-01-23 11:31:12 -08:00
Alina Sbirlea	a0f627d584	[IndVarSimplify] Fix for MemorySSA preserve.	2020-01-23 11:06:16 -08:00
Guillaume Chatelet	59f95222d4	[Alignment][NFC] Use Align with CreateAlignedStore Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73274	2020-01-23 17:34:32 +01:00
Kazu Hirata	41784bed01	Revert "Resubmit: [JumpThreading] Thread jumps through two basic blocks" This reverts commit `53b68e676f`. Our internal tests are showing breakage with this patch.	2020-01-23 06:34:03 -08:00
Guillaume Chatelet	279fa8e006	[Alignement][NFC] Deprecate untyped CreateAlignedLoad Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260	2020-01-23 13:34:32 +01:00
Daniil Suchkov	6fc9e60149	NFC. Remove obsolete SimpleAnalysis infrastructure Apparently cache of AliasSetTrackers held by LICM was the only user of SimpleAnalysis infrastructure. Now, given that we no longer have that cache, this infrastructure is obsolete and, taking into account its nature, we don't want any new solutions to be based on it. Reviewers: asbirlea, fhahn, efriedma, reames Reviewed-By: asbirlea Differential Revision: https://reviews.llvm.org/D73085	2020-01-23 13:58:30 +07:00
Daniil Suchkov	53a28bd891	[LICM] NFC. Remove AST caching infrastructure Since LICM doesn't use AST caching any more (see D73081), this infrastructure is now obsolete and we can remove it. Reviewers: asbirlea, fhahn, efriedma, reames Reviewed-By: asbirlea Differential Revision: https://reviews.llvm.org/D73084	2020-01-23 12:33:50 +07:00
Jonas Devlieghere	cf2b498d28	[llvm/Transforms] Fix warning: private field 'MSSA' is not used	2020-01-22 18:07:53 -08:00
Alina Sbirlea	adc4faf532	[IndVarSimplify] Teach IndVarSimplify to preserve MemorySSA.	2020-01-22 16:33:17 -08:00
Alina Sbirlea	b5b6126d97	[IndVarSimplify] Cleanup spaces and reduce variable scope [NFCI] Minor clean-ups + clang-format.	2020-01-22 15:32:20 -08:00
Alina Sbirlea	6baf31b7c1	[LoopIdiomRecognize] Reduce variable scope. [NFCI]	2020-01-22 15:30:08 -08:00
Alina Sbirlea	efb130fc93	[LoopDeletion] Teach LoopDeletion to preserve MemorySSA if available. If MemorySSA analysis is analysis, LoopDeletion now preserves it.	2020-01-22 11:38:38 -08:00
Daniil Suchkov	7bdc83f340	[LICM] Don't cache AliasSetTrackers when run under legacy PM Summary: This is the first step towards complete removal of AST caching from LICM. Attempts to keep LICM's AST cache up to date across passes can lead to miscompiles like this one: https://bugs.llvm.org/show_bug.cgi?id=44320. LICM has already switched to using MemorySSA to do sinking and hoisting and only builds an AliasSetTracker on demand for the promoteToScalars step, without caching it from one LICM instance to the next. Given this, we don't have compile-time reasons to keep AST caching any more. The only scenario where the caching would be used currently is when using the LegacyPassManager and setting -enable-mssa-loop-dependency=false. This switch should help us to surface any possible issues that may arise along this way, also it turns subsequent removal of AST caching into NFC. Reviewers: asbirlea, fhahn, efriedma, reames Reviewed By: asbirlea Subscribers: hiraditya, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73081	2020-01-22 13:16:45 +07:00
Florian Hahn	f42994f228	[Matrix] Hide and describe matrix-propagate-shape option.	2020-01-21 14:28:47 -08:00
Guillaume Chatelet	46b9563cf6	[Alignment][NFC] Use Align with CreateElementUnorderedAtomicMemCpy Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, nicolasvasilache Subscribers: hiraditya, jfb, mehdi_amini, rriddle, jpienaar, burmako, shauheen, antiagainst, csigg, arpith-jacob, mgester, lucyrfox, herhut, liufengdb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73041	2020-01-20 15:39:45 +01:00
Sjoerd Meijer	93175a5caa	[IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFCI This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils thus making it a generic loop utility function. This allows to rewrite loop exit values by just calling this function without running the whole IndVarSimplify pass. We use this in D72714 to rematerialise the iteration count in exit blocks, so that we can clean-up loop update expressions inside the hardware-loops later. Differential Revision: https://reviews.llvm.org/D72602	2020-01-20 09:05:00 +00:00
Drew Wock	0bcfafc5e7	[SeparateConstOffsetFromGEP] Fix: sext(a) + sext(b) -> sext(a + b) matches add and sub instructions with one another During the SeparateConstOffsetFromGEP pass, signed extensions are distributed to the values that feed into them and then later recombined. The recombination stage is somewhat problematic- it doesn't differ add and sub instructions from another when matching the sext(a) +/- sext(b) -> sext(a +/- b) pattern in some instances. An example- the IR contains: %unextendedA %unextendedB %subuAuB = unextendedA - unextendedB %extA = extend A %extB = extend B %addeAeB = extA + extB The problematic optimization will transform that into: %unextendedA %unextendedB %subuAuB = unextendedA - unextendedB %extA = extend A %extB = extend B %addeAeB = extend subuAuB ; Obviously not semantically equivalent to the IR input. This patch fixes that. Patch by Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D65967	2020-01-17 12:22:52 -05:00
Kazu Hirata	53b68e676f	Resubmit: [JumpThreading] Thread jumps through two basic blocks This reverts commit `2d258ed931`. This revision fixes the Windows build and adds a testcase for it, namely thread-two-bbs3.ll. My original patch improperly copied EH pads on Windows. This patch disregards jump threading opportunities having to do with EH pads. [JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-01-16 12:33:37 -08:00
Arkady Shlykov	c87982b467	Revert "[Loop Peeling] Add possibility to enable peeling on loop nests." This reverts commit `3f3017e` because there's a failure on peel-loop-nests.ll with LLVM_ENABLE_EXPENSIVE_CHECKS on. Differential Revision: https://reviews.llvm.org/D70304	2020-01-16 10:33:38 -08:00
Fedor Sergeev	3478551bf3	[GVN] introduce GVNOptions to control GVN pass behavior There are a few global (cl::opt) controls that enable optional behavior in GVN. Introduce GVNOptions that provide corresponding per-pass instance controls. That will allow to use GVN multiple times in pipeline each time with different settings. Reviewers: asbirlea, rnk, reames, skatkov, fhahn Reviewed By: fhahn Tags: #llvm Differential Revision: https://reviews.llvm.org/D72732	2020-01-16 20:21:08 +03:00
Mircea Trofin	7acfda633f	[llvm] Make new pass manager's OptimizationLevel a class Summary: The old pass manager separated speed optimization and size optimization levels into two unsigned values. Coallescing both in an enum in the new pass manager may lead to unintentional casts and comparisons. In particular, taking a look at how the loop unroll passes were constructed previously, the Os/Oz are now (==new pass manager) treated just like O3, likely unintentionally. This change disallows raw comparisons between optimization levels, to avoid such unintended effects. As an effect, the O{s\|z} behavior changes for loop unrolling and loop unroll and jam, matching O2 rather than O3. The change also parameterizes the threshold values used for loop unrolling, primarily to aid testing. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: zzheng, ychen, mehdi_amini, hiraditya, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72547	2020-01-16 09:00:56 -08:00
Fedor Sergeev	8a4d12ae5b	[BasicBlock] add helper getPostdominatingDeoptimizeCall It appears to be rather useful when analyzing Loops with multiple deoptimizing exits, perhaps merged ones. For now it is used in LoopPredication, will be adding more uses in other loop passes. Reviewers: asbirlea, fhahn, skatkov, spatel, reames Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D72754	2020-01-16 01:15:57 +03:00
Mircea Trofin	5466597fee	[NFC] Refactor InlineResult for readability Summary: InlineResult is used both in APIs assessing whether a call site is inlinable (e.g. llvm::isInlineViable) as well as in the function inlining utility (llvm::InlineFunction). It means slightly different things (can/should inlining happen, vs did it happen), and the implicit casting may introduce ambiguity (casting from 'false' in InlineFunction will default a message about hight costs, which is incorrect here). The change renames the type to a more generic name, and disables implicit constructors. Reviewers: eraman, davidxl Reviewed By: davidxl Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72744	2020-01-15 13:34:20 -08:00
Zhongduo Lin	34ba96a3d4	[NFC][IndVarSimplify] remove duplicate code in widenWithVariantLoadUseCodegen. Summary: Duplicate code in widenWithVariantLoadUseCodegen is removed and also use assert to check unknown extension type as it should be filtered out by the pre condition check before calling this function. Reviewers: az, sanjoy, sebpop, efriedma, javed.absar, sanjoy.google Reviewed By: efriedma Subscribers: hiraditya, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D72652	2020-01-15 16:27:58 -05:00
Arkady Shlykov	3f3017e162	[Loop Peeling] Add possibility to enable peeling on loop nests. Summary: Current peeling implementation bails out in case of loop nests. The patch introduces a field in TargetTransformInfo structure that certain targets can use to relax the constraints if it's profitable (disabled by default). Also additional option is added to enable peeling manually for experimenting and testing purposes. Reviewers: fhahn, lebedev.ri, xbolva00 Reviewed By: xbolva00 Subscribers: xbolva00, hiraditya, zzheng, llvm-commits Differential Revision: https://reviews.llvm.org/D70304	2020-01-15 08:25:21 -08:00
Nuno Lopes	87407fc03c	DSE: fix bug where we would only check libcalls for name rather than whole decl	2020-01-11 11:57:29 +00:00
Whitney Tsang	d27a15fed7	[NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass. Summary: This patch changes LoopUnrollAndJamPass to a function pass, and keeps the loops traversal order same as defined in FunctionToLoopPassAdaptor LoopPassManager.h. The next patch will change the loop traversal to outer to inner order, so more loops can be transform. Discussion in llvm-dev mailing list: https://groups.google.com/forum/#!topic/llvm-dev/LF4rUjkVI2g Reviewer: dmgreen, jdoerfert, Meinersbur, kbarton, bmahjour, etiotto Reviewed By: dmgreen Subscribers: hiraditya, zzheng, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D72230	2020-01-09 16:18:36 +00:00
Florian Hahn	ccf24225e3	[Matrix] Update shape propagation to iterate until done. This patch updates the shape propagation to iterate until no new shape information is discovered. As initial seed for the forward propagation, we use the matrix intrinsic instructions. Both propagateShapeForward and propagateShapeBackward return new work lists, with the instructions to be used for the next iteration. When propagating forward, we record all instructions we added new shape information for. When propagating backward, we record all users of instructions we added new shape information for. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70901	2020-01-09 10:52:52 +00:00
Florian Hahn	7adf6644f5	[Matrix] Propagate and use shape information for loads. This patch extends to shape propagation to also include load instructions and implements shape aware lowering for vector loads. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70900	2020-01-09 10:21:20 +00:00
Florian Hahn	459ad8e97e	[Matrix] Implement back-propagation of shape information. This patch extends the shape propagation for matrix operations to also propagate the shape of instructions to their operands. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70899	2020-01-09 09:48:07 +00:00
Kazu Hirata	2d258ed931	Revert "[JumpThreading] Thread jumps through two basic blocks" It looks like my patch breaks the sanitizer-windows build: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/56324 This reverts commit `ead815924e`.	2020-01-08 13:58:39 -08:00
Kazu Hirata	ead815924e	[JumpThreading] Thread jumps through two basic blocks Summary: This patch teaches JumpThreading.cpp to thread through two basic blocks like: bb3: %var = phi i32* [ null, %bb1 ], [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 by duplicating basic blocks like bb3 above. Once we duplicate bb3 as bb3.dup and redirect edge bb2->bb3 to bb2->bb3.dup, we have: bb3: %var = phi i32* [ @a, %bb2 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb3.dup: %var = phi i32* [ null, %bb1 ] %tobool = icmp eq i32 %cond, 0 br i1 %tobool, label %bb4, label ... bb4: %cmp = icmp eq i32* %var, null br i1 %cmp, label bb5, label bb6 Then the existing code in JumpThreading.cpp can thread edge bb3.dup->bb4 through bb4 and eventually create bb3.dup->bb5. Reviewers: wmi Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70247	2020-01-08 06:57:36 -08:00
Philip Reames	312a532dc0	[GVN/FP] Considate logic for reasoning about equality vs equivalance for floats Factor out common logic into some reasonable commented helper functions. In the process, ensure that the in-block vs cross-block cases are handled the same. They previously weren't. Differential Revision: https://reviews.llvm.org/D67126	2020-01-07 16:05:04 -08:00
Florian Hahn	b8a3c34eee	Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)." This reverts commit `51ef53f3bd`, as it breaks some bots.	2020-01-04 18:44:38 +00:00
Florian Hahn	51ef53f3bd	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-01-04 18:29:35 +00:00
Ankit	369a919514	Fix for a dangling point bug in DeadStoreElimination pass The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction. While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration. In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container. Reviewers: fhahn, bcahoon, efriedma Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65326	2020-01-03 14:28:44 +00:00
Florian Hahn	dc2c9b0fcf	[Matrix] Propagate and use shape info for binary operators. This patch extends the current shape propagation and shape aware lowering to also support binary operators. Those operators are uniform with respect to their shape (shape of the input operands is the same as the shape of their result). Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70898	2019-12-27 15:50:47 +00:00
Whitney Tsang	d1f41b2ca9	[NFC][LoopFusion] Fix printing of the guard branch. Reviewer: kbarton, jdoerfert Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71878	2019-12-26 02:45:29 +00:00
Florian Hahn	8d6f59b78a	[Matrix] Use fmuladd for matrix.multiply if allowed. If the matrix.multiply calls have the contract fast math flag, we can use fmuladd. This als adds a command line option to force fmuladd generation. We can retire this option once there is a clang-level option. Reviewers: anemet, Gerolf, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70951	2019-12-23 14:49:14 +01:00
Florian Hahn	109e4e3851	[Matrix] Add forward shape propagation and first shape aware lowerings. This patch adds infrastructure for forward shape propagation to LowerMatrixIntrinsics. It also updates the pass to make use of the shape information to break up larger vector operations and to eliminate unnecessary conversion operations between columnwise matrixes and flattened vectors: if shape information is available for an instruction, lower the operation to a set of instructions operating on columns. For example, a store of a matrix is broken down into separate stores for each column. For users that do not have shape information (e.g. because they do not yet support shape information aware lowering), we pack the result columns into a flat vector and update those users. It also adds shape aware lowering for the first non-intrinsic instruction: vector stores. Example: For %c = call <4 x double> @llvm.matrix.transpose(<4 x double> %a, i32 2, i32 2) store <4 x double> %c, <4 x double>* %Ptr We generate the code below without shape propagation. Note %9 which combines the columns of the transposed matrix into a flat vector. %split = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 0, i32 1> %split1 = shufflevector <4 x double> %a, <4 x double> undef, <2 x i32> <i32 2, i32 3> %1 = extractelement <2 x double> %split, i64 0 %2 = insertelement <2 x double> undef, double %1, i64 0 %3 = extractelement <2 x double> %split1, i64 0 %4 = insertelement <2 x double> %2, double %3, i64 1 %5 = extractelement <2 x double> %split, i64 1 %6 = insertelement <2 x double> undef, double %5, i64 0 %7 = extractelement <2 x double> %split1, i64 1 %8 = insertelement <2 x double> %6, double %7, i64 1 %9 = shufflevector <2 x double> %4, <2 x double> %8, <4 x i32> <i32 0, i32 1, i32 2, i32 3> store <4 x double> %9, <4 x double>* %Ptr With this patch, we propagate the 2x2 shape information from the transpose to the store and we generate the code below. Note that we store the columns directly and do not need an extra shuffle. %9 = bitcast <4 x double>* %Ptr to double* %10 = bitcast double* %9 to <2 x double>* store <2 x double> %4, <2 x double>* %10, align 8 %11 = getelementptr double, double* %9, i32 2 %12 = bitcast double* %11 to <2 x double>* store <2 x double> %8, <2 x double>* %12, align 8 Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70897	2019-12-23 13:51:56 +01:00

... 4 5 6 7 8 ...

10320 Commits