llvm-project

Commit Graph

Author	SHA1	Message	Date
Roman Lebedev	36593a30a4	[SimplifyCFG] ConstantFoldTerminator(): switch to non-permissive DomTree updates in `SwitchInst` handling ... which requires not deleting edges that will still be present.	2021-01-08 02:15:24 +03:00
Roman Lebedev	16ab8e5f6d	[SimplifyCFG] ConstantFoldTerminator(): handle matching destinations of condbr earlier We need to handle this case before dealing with the case of a constant branch condition, because if the destinations match, the latter fold would try to remove the DomTree edge that would still be present. This allows making that particular DomTree update non-permissive.	2021-01-08 02:15:24 +03:00
dfukalov	6a87e9b08b	[NFC][AMDGPU] Reduce include files dependency. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813	2021-01-07 22:22:05 +03:00
Roman Lebedev	6be1fd6b20	[SimplifyCFG] FoldValueComparisonIntoPredecessors(): drop reachable erroneous assert I have added it in `d15d81c` because it seemed correct, was holding for all the tests so far, and was validating the fix added in the same commit, but as David Major is pointing out (with a reproducer), the assertion isn't really correct after all. So remove it. Note that `d15d81c` is still fine.	2021-01-07 18:05:04 +03:00
Sidharth Baveja	048f184ee4	[SplitEdge] Add new parameter to SplitEdge to name the newly created basic block Summary: Currently SplitEdge does not support passing in a parameter that names the newly created BasicBlock. This patch updates the function such that the name of the block can be passed in, if users of this utility decide to do so. Reviewed By: Whitney, bmahjour, asbirlea, jamieschmeiser Differential Revision: https://reviews.llvm.org/D94176	2021-01-07 14:49:23 +00:00
Oliver Stannard	76f6b125ce	Revert "[llvm] Use BasicBlock::phis() (NFC)" Reverting because this causes crashes on the 2-stage buildbots, for example http://lab.llvm.org:8011/#/builders/7/builds/1140. This reverts commit `9b228f107d`.	2021-01-07 09:43:33 +00:00
Kazu Hirata	9b228f107d	[llvm] Use BasicBlock::phis() (NFC)	2021-01-06 18:27:35 -08:00
Alina Sbirlea	63aeaf754a	[DominatorTree] Add support for mixed pre/post CFG views. Add support for mixed pre/post CFG views. Update usages of the MemorySSAUpdater to use the new DT API by requesting the DT updates to be done by the MSSAUpdater. Differential Revision: https://reviews.llvm.org/D93371	2021-01-06 14:53:09 -08:00
Arthur Eubanks	7fea561eb1	[CGSCC][Coroutine][NewPM] Properly support function splitting/outlining Previously when trying to support CoroSplit's function splitting, we added in a hack that simply added the new function's node into the original function's SCC (https://reviews.llvm.org/D87798). This is incorrect since it might be in its own SCC. Now, more similar to the previous design, we have callers explicitly notify the LazyCallGraph that a function has been split out from another one. In order to properly support CoroSplit, there are two ways functions can be split out. One is the normal expected "outlining" of one function into a new one. The new function may only contain references to other functions that the original did. The original function must reference the new function. The new function may reference the original function, which can result in the new function being in the same SCC as the original function. The weird case is when the original function indirectly references the new function, but the new function directly calls the original function, resulting in the new SCC being a parent of the original function's SCC. This form of function splitting works with CoroSplit's Switch ABI. The second way of splitting is more specific to CoroSplit. CoroSplit's Retcon and Async ABIs split the original function into multiple functions that all reference each other and are referenced by the original function. In order to keep the LazyCallGraph in a valid state, all new functions must be processed together, else some nodes won't be populated. To keep things simple, this only supports the case where all new edges are ref edges, and every new function references every other new function. There can be a reference back from any new function to the original function, putting all functions in the same RefSCC. This also adds asserts that all nodes in a (Ref)SCC can reach all other nodes to prevent future incorrect hacks. The original hacks in https://reviews.llvm.org/D87798 are no longer necessary since all new functions should have been registered before calling updateCGAndAnalysisManagerForPass. This fixes all coroutine tests when opt's -enable-new-pm is true by default. This also fixes PR48190, which was likely due to the previous hack breaking SCC invariants. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D93828	2021-01-06 11:19:15 -08:00
Francesco Petrogalli	dfd3384fee	[InstCombine] Update valueCoversEntireFragment to use TypeSize * Update valueCoversEntireFragment to use TypeSize. * Add a regression test. * Assertions have been added to protect untested codepaths. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91806	2021-01-06 17:14:59 +00:00
Roman Lebedev	a14945c1db	[SimplifyCFG] SimplifyEqualityComparisonWithOnlyPredecessor(): really don't delete DomTree edges multiple times	2021-01-06 01:52:39 +03:00
Roman Lebedev	2b437fcd47	[SimplifyCFG] SwitchToLookupTable(): switch to non-permissive DomTree updates ... which requires not deleting a DomTree edge that we just deleted.	2021-01-06 01:52:38 +03:00
Roman Lebedev	fa5447aa3f	[NFC][SimplifyCFG] SwitchToLookupTable(): pull out SI->getParent() into a variable	2021-01-06 01:52:38 +03:00
Roman Lebedev	d15d81ce15	[SimplifyCFG] FoldValueComparisonIntoPredecessors(): deal with each predecessor only once If the predecessor is a switch, and BB is not the default destination, multiple cases could have the same destination, and it doesn't make sense to re-process the predecessor, because we won't make any changes; once is enough. I'm not sure this can be really tested, other than via the assertion being added here, which fires without the fix.	2021-01-06 01:52:37 +03:00
Roman Lebedev	fc96cb2dad	[SimplifyCFG] FoldValueComparisonIntoPredecessors(): switch to non-permissive DomTree updates ... which requires not adding a DomTree edge that we just added.	2021-01-06 01:52:37 +03:00
Roman Lebedev	29ca7d5a1a	[SimplifyCFG] simplifyUnreachable(): fix handling of degenerate same-destination conditional branch One would hope that it would have been already canonicalized into an unconditional branch, but that isn't really guaranteed to happen with SimplifyCFG's visitation order.	2021-01-06 01:52:36 +03:00
Roman Lebedev	3460719f58	[NFC][SimplifyCFG] Add a test with same-destination conditional branch Reported by Mikael Holmén as post-commit feedback on https://reviews.llvm.org/rG2d07414ee5f74a09fb89723b4a9bb0818bdc2e18#968162	2021-01-06 01:52:36 +03:00
Roman Lebedev	f98535686e	[SimplifyCFG] simplifyUnreachable(): switch to non-permissive DomTree updates ... which requires not removing a DomTree edge if the switch's default still points at that destination, because it can't be removed; ... and not processing the same predecessor more than once.	2021-01-06 01:52:36 +03:00
Atmn Patel	f88a797521	[LoopDeletion] Allows deletion of possibly infinite side-effect free loops From C11 and C++11 onwards, a forward-progress requirement has been introduced for both languages. In the case of C, loops with non-constant conditionals that do not have any observable side-effects (as defined by 6.8.5p6) can be assumed by the implementation to terminate, and in the case of C++, this assumption extends to all functions. The clang frontend will emit the `mustprogress` function attribute for C++ functions (D86233, D85393, D86841) and emit the loop metadata `llvm.loop.mustprogress` for every loop in C11 or later that has a non-constant conditional. This patch modifies LoopDeletion so that only loops with the `llvm.loop.mustprogress` metadata or loops contained in functions that are required to make progress (`mustprogress` or `willreturn`) are checked for observable side-effects. If these loops do not have an observable side-effect, then we delete them. Loops without observable side-effects that do not satisfy the above conditions will not be deleted. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86844	2021-01-05 09:56:16 -05:00
Simon Pilgrim	a000366d05	[SimplifyIndVar] createWideIV - make WideIVInfo arg a const ref. NFCI. The WideIVInfo arg is only ever used as a const. Fixes cppcheck warning.	2021-01-05 10:31:45 +00:00
Roman Lebedev	32c47ebef1	[SimplifyCFG] SimplifyCondBranchToTwoReturns(): switch to non-permissive DomTree updates ... which requires not deleting an edge that just got deleted, because we could be dealing with a block that didn't go through ConstantFoldTerminator() yet, and thus has a degenerate cond br with matching true/false destinations.	2021-01-05 01:26:37 +03:00
Roman Lebedev	110b3d7855	[SimplifyCFG] SimplifyEqualityComparisonWithOnlyPredecessor(): switch to non-permissive DomTree updates ... which requires not deleting an edge that just got deleted.	2021-01-05 01:26:37 +03:00
Roman Lebedev	a8604e3d5b	[SimplifyCFG] simplifyIndirectBr(): switch to non-permissive DomTree updates ... which requires not deleting an edge that just got deleted.	2021-01-05 01:26:36 +03:00
Roman Lebedev	3fb57222c4	[NFCI] SimplifyCFG: switch to non-permissive DomTree updates, where possible Notably, this doesn't switch every case; the remaining cases don't actually pass sanity checks in non-permissive mode, and therefore require further analysis. Note that SimplifyCFG still defaults to not preserving DomTree, so this is effectively an NFC change.	2021-01-05 01:26:36 +03:00
Sanjay Patel	36263a7ccc	[LoopUtils] remove redundant opcode parameter; NFC While here, rename the inaccurate getRecurrenceBinOp() because that was also used to get CmpInst opcodes. The recurrence/reduction kind should always refer to the expected opcode for a reduction. SLP appears to be the only direct caller of createSimpleTargetReduction(), and that calling code ideally should not be carrying around both an opcode and a reduction kind. This should allow us to generalize reduction matching to use intrinsics instead of only binops.	2021-01-04 17:05:28 -05:00
Sanjay Patel	9766957524	[LoopUtils] reduce code for creating reduction; NFC We can return from each case instead of creating a temporary variable just to have a common return.	2021-01-04 16:05:03 -05:00
Sanjay Patel	58b6c5d932	[LoopUtils] reorder logic for creating reduction; NFC If we are using a shuffle reduction, we don't need to go through the switch on opcode - return early.	2021-01-04 16:05:02 -05:00
Whitney Tsang	de6d43f16c	Revert "[LoopNest] Allow empty basic blocks without loops" This reverts commit `9a17bff4f7`.	2021-01-04 20:42:21 +00:00
Whitney Tsang	9a17bff4f7	[LoopNest] Allow empty basic blocks without loops Allow loop nests with empty basic blocks without loops in different levels as perfect. Reviewers: Meinersbur Differential Revision: https://reviews.llvm.org/D93665	2021-01-04 19:59:50 +00:00
Philip Reames	7c63aac7bd	Revert "[LoopDeletion] Break backedge of loops when known not taken" This reverts commit `dd6bb367d1`. Multi-stage builders are showing an assertion failure w/LCSSA not being preserved on entry to IndVars. Reason isn't clear, reverting while investigating.	2021-01-04 09:50:47 -08:00
Philip Reames	dd6bb367d1	[LoopDeletion] Break backedge of loops when known not taken The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects. Differential Revision: https://reviews.llvm.org/D93906	2021-01-04 09:19:29 -08:00
Roman Lebedev	98cd1c33e3	[NFC][SimplifyCFG] Hoist 'original' DomTree verification from simplifyOnce() into run() This is NFC since SimplifyCFG still currently defaults to not preserving DomTree. SimplifyCFGOpt::simplifyOnce() is only called from SimplifyCFGOpt::run(), and cannot be called externally, since SimplifyCFGOpt is defined in a .cpp file. This avoids some needless verifications, and is thus a bit faster without sacrificing precision.	2021-01-04 01:02:02 +03:00
Roman Lebedev	a7684940f0	[SimplifyCFG] SimplifyTerminatorOnSelect(): fix/tune DomTree updates We only need to remove non-TrueBB/non-FalseBB successors, and we only need to do that once. We don't need to insert any new edges, because no new successors will be added.	2021-01-04 01:02:02 +03:00
Roman Lebedev	70935b9595	[NFC][SimplifyCFG] SimplifyTerminatorOnSelect(): pull out OldTerm->getParent() into a variable	2021-01-04 01:02:02 +03:00
Roman Lebedev	5fa241a657	[SimplifyCFG] FoldValueComparisonIntoPredecessors(): fine-tune/fix DomTree preservation, take 2	2021-01-03 01:45:48 +03:00
Roman Lebedev	6a3a8d17eb	[SimplifyCFG] FoldValueComparisonIntoPredecessors(): fine-tune/fix DomTree preservation	2021-01-03 01:45:48 +03:00
Roman Lebedev	7c8b8063b6	[SimplifyCFG][AMDGPU] AMDGPUUnifyDivergentExitNodes: SimplifyCFG isn't ready to preserve PostDomTree There are a number of transforms in SimplifyCFG that take DomTree out of DomTreeUpdater, and do updates manually. Until they are fixed, user passes are unable to claim that PDT is preserved. Note that the default for SimplifyCFG is still not to preserve DomTree, so this is still effectively NFC.	2021-01-03 01:45:46 +03:00
Kazu Hirata	530c5af6a4	[Transforms] Construct SmallVector with iterator ranges (NFC)	2021-01-02 09:24:17 -08:00
Roman Lebedev	b9da488ad7	[SimplifyCFG] Don't actually take DomTreeUpdater unless we intend to maintain DomTree validity This guards against unintentional mistakes like the one I just fixed in the previous commit.	2021-01-02 14:40:55 +03:00
Roman Lebedev	b4429f3cdd	[SimplifyCFG] Teach removeUndefIntroducingPredecessor to preserve DomTree	2021-01-02 01:01:20 +03:00
Roman Lebedev	657c1e09da	[SimplifyCFG] Teach eliminateDeadSwitchCases() to preserve DomTree, part 2	2021-01-02 01:01:18 +03:00
Roman Lebedev	f1ce696056	[SimplifyCFG] Teach tryWidenCondBranchToCondBranch() to preserve DomTree	2021-01-02 01:01:17 +03:00
Kazu Hirata	f43daf1b62	[SSAUpdater] Remove unused code InstrIsPHI (NFC) The last use of this function was removed on Jan 4, 2018 in commit `90ecac01e9`.	2021-01-01 12:44:52 -08:00
Sanjay Patel	c74e8539ff	[Analysis] flatten enums for recurrence types This is almost all mechanical search-and-replace and no-functional-change-intended (NFC). Having a single enum makes it easier to match/reason about the reduction cases. The goal is to remove `Opcode` from reduction matching code in the vectorizers because that makes it harder to adapt the code to handle intrinsics. The code in RecurrenceDescriptor::AddReductionVar() is the only place that required closer inspection. It uses a RecurrenceDescriptor and a second InstDesc to sometimes overwrite part of the struct. It seems like we should be able to simplify that logic, but it's not clear exactly which cmp+sel patterns we are trying to handle/avoid.	2021-01-01 12:20:16 -05:00
Roman Lebedev	831636b0e6	[SimplifyCFG] SUCCESS! Teach createUnreachableSwitchDefault() to preserve DomTree This pretty much concludes the patch series for updating SimplifyCFG to preserve DomTree. All 318 dedicated `-simplifycfg` tests now pass with `-simplifycfg-require-and-preserve-domtree=1`. There are a few leftovers that apparently don't have good test coverage. I do not yet know what gaps in test coverage the wider-scale testing will reveal, but the default flip might be close.	2021-01-01 03:25:25 +03:00
Roman Lebedev	e1440d43bc	[SimplifyCFG] Teach tryToSimplifyUncondBranchWithICmpInIt() to preserve DomTree	2021-01-01 03:25:25 +03:00
Roman Lebedev	8866583953	[SimplifyCFG] Teach FoldValueComparisonIntoPredecessors() to preserve DomTree, part 2	2021-01-01 03:25:24 +03:00
Roman Lebedev	a815b6b2b2	[SimplifyCFG] Teach eliminateDeadSwitchCases() to preserve DomTree, part 1	2021-01-01 03:25:24 +03:00
Roman Lebedev	0d2f219d4d	[SimplifyCFG] Teach SimplifyEqualityComparisonWithOnlyPredecessor() to preserve DomTree, part 3	2021-01-01 03:25:23 +03:00
Roman Lebedev	9f17dab1f4	[SimplifyCFG] Teach simplifyIndirectBr() to preserve DomTree	2021-01-01 03:25:23 +03:00
Roman Lebedev	b7c463d7b8	[SimplifyCFG] Teach FoldBranchToCommonDest() to preserve DomTree, part 2	2021-01-01 03:25:23 +03:00
Roman Lebedev	c1b825d4b8	[SimplifyCFG] Teach FoldValueComparisonIntoPredecessors() to preserve DomTree, part 1	2021-01-01 03:25:22 +03:00
Bogdan Graur	8bee4d4e8f	Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops" Test clang/test/Misc/loop-opt-setup.c fails when executed in Release. This reverts commit `6f1503d598`. Reviewed By: SureYeaah Differential Revision: https://reviews.llvm.org/D93956	2020-12-31 11:47:49 +00:00
Atmn Patel	6f1503d598	[LoopDeletion] Allows deletion of possibly infinite side-effect free loops From C11 and C++11 onwards, a forward-progress requirement has been introduced for both languages. In the case of C, loops with non-constant conditionals that do not have any observable side-effects (as defined by 6.8.5p6) can be assumed by the implementation to terminate, and in the case of C++, this assumption extends to all functions. The clang frontend will emit the `mustprogress` function attribute for C++ functions (D86233, D85393, D86841) and emit the loop metadata `llvm.loop.mustprogress` for every loop in C11 or later that has a non-constant conditional. This patch modifies LoopDeletion so that only loops with the `llvm.loop.mustprogress` metadata or loops contained in functions that are required to make progress (`mustprogress` or `willreturn`) are checked for observable side-effects. If these loops do not have an observable side-effect, then we delete them. Loops without observable side-effects that do not satisfy the above conditions will not be deleted. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86844	2020-12-30 21:43:01 -05:00
Roman Lebedev	7f221c9196	[SimplifyCFG] Teach SwitchToLookupTable() to preserve DomTree	2020-12-30 23:58:41 +03:00
Roman Lebedev	a17025aa61	[SimplifyCFG] Teach switchToSelect() to preserve DomTree	2020-12-30 23:58:40 +03:00
Roman Lebedev	c45f765c0d	[SimplifyCFG] Teach SimplifyBranchOnICmpChain() to preserve DomTree	2020-12-30 23:58:40 +03:00
Sanjay Patel	8ca60db40b	[LoopUtils] reduce FMF and min/max complexity when forming reductions I don't know if there's some way this changes what the vectorizers may produce for reductions, but I have added test coverage with `3567908` and `5ced712` to show that both passes already have bugs in this area. Hopefully this does not make things worse before we can really fix it.	2020-12-30 15:22:26 -05:00
Sanjay Patel	e90ea76380	[IR] remove 'NoNan' param when creating FP reductions This is no-functional-change-intended (AFAIK, we can't isolate this difference in a regression test). That's because the callers should be setting the IRBuilder's FMF field when creating the reduction and/or setting those flags after creating. It doesn't make sense to override this one flag alone. This is part of a multi-step process to clean up the FMF setting/propagation. See PR35538 for an example.	2020-12-30 09:51:23 -05:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Kazu Hirata	16d20e2554	[Transforms/Utils] Construct SmallVector with iterator ranges (NFC)	2020-12-29 19:23:23 -08:00
Roman Lebedev	39a56f7f17	[SimplifyCFG] Teach SimplifyTerminatorOnSelect() to preserve DomTree	2020-12-30 00:48:12 +03:00
Roman Lebedev	ec0b671a61	[SimplifyCFG] Teach SimplifyCondBranchToCondBranch() to preserve DomTree	2020-12-30 00:48:12 +03:00
Roman Lebedev	307156246f	[SimplifyCFG] Teach mergeConditionalStoreToAddress() to preserve DomTree	2020-12-30 00:48:11 +03:00
Roman Lebedev	d4c0abb4a3	[SimplifyCFG] Teach FoldCondBranchOnPHI() to preserve DomTree	2020-12-30 00:48:11 +03:00
Roman Lebedev	b8121b2e62	[SimplifyCFG] Teach SinkCommonCodeFromPredecessors() to preserve DomTree	2020-12-30 00:48:11 +03:00
Roman Lebedev	18c407bf4c	[SimplifyCFG] Teach HoistThenElseCodeToIf() to preserve DomTree	2020-12-30 00:48:10 +03:00
Roman Lebedev	fe9bdd9621	[SimplifyCFG] Teach SimplifyEqualityComparisonWithOnlyPredecessor() to preserve DomTree, part 2	2020-12-30 00:48:10 +03:00
Roman Lebedev	6027e05dbf	[SimplifyCFG] Teach SimplifyEqualityComparisonWithOnlyPredecessor() to preserve DomTree, part 1	2020-12-30 00:48:10 +03:00
Sanjay Patel	8d18bc8e6d	[Utils] reduce code in createTargetReduction(); NFC The switch duplicated the translation in getRecurrenceBinOp(). This code is still weird because it translates to the TTI ReductionFlags for min/max, but then createSimpleTargetReduction() converts that back to RecurrenceDescriptor::MinMaxRecurrenceKind.	2020-12-29 15:56:19 -05:00
Roman Lebedev	ef93f7a11c	[SimplifyCFG] FoldBranchToCommonDest: gracefully handle unreachable code () We might be dealing with unreachable code, so the bonus instruction we clone might be self-referencing. There is a sanity check that all uses of bonus instructions that are not in the original block with said bonus instructions are PHI nodes, and that is obviously not the case for self-referencing instructions. So if we find such a use, just rewrite it. Thanks to Mikael Holmén for the reproducer! Fixes https://bugs.llvm.org/show_bug.cgi?id=48450#c8	2020-12-28 23:31:19 +03:00
Yevgeny Rouban	d76c1d2247	[RS4GC] Lazily set changed flag when folding single entry phis The function FoldSingleEntryPHINodes() is changed to return whether it has changed the IR. This return value is used by RS4GC to set the MadeChange flag accordingly. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D93810	2020-12-28 10:54:21 +07:00
Florian Hahn	0ea3749b3c	[LV] Set up branch from middle block earlier. Previously the branch from the middle block to the scalar preheader & exit was being set up at the end of skeleton creation in completeLoopSkeleton. Inserting SCEV or runtime checks may result in LCSSA phis being created, if they are required. Adjusting branches afterwards may break those PHIs. To avoid this, we can instead create the branch from the middle block to the exit after we created the middle block, so we have the final CFG before potentially adjusting/creating PHIs. This fixes a crash for the included test case. For the non-crashing case, this is almost an NFC with respect to the generated code. The only change is the order of the predecessors of the involved branch targets. Note an assertion was moved from LoopVersioning() to LoopVersioning::versionLoop. Adjusting the branches means loop-simplify form may be broken before constructing LoopVersioning. But LV only uses LoopVersioning to annotate the loop instructions with !noalias metadata, which does not require loop-simplify form. This is a fix for an existing issue uncovered by D93317.	2020-12-27 18:21:12 +00:00
Kazu Hirata	8299fb8f25	[Transforms] Use llvm::append_range (NFC)	2020-12-27 09:57:29 -08:00
Kazu Hirata	789d250613	[CodeGen, Transforms] Use *Map::lookup (NFC)	2020-12-27 09:57:27 -08:00
Kazu Hirata	46bea9b297	[Local] Remove unused function RemovePredecessorAndSimplify (NFC) The last use of the function was removed on Sep 29, 2010 in commit `99c985c37d`.	2020-12-25 09:35:20 -08:00
Roman Lebedev	ff3749fc79	[NFC] SimplifyCFGOpt::simplifyUnreachable(): pacify unused variable warning Thanks to Luke Benes for pointing it out.	2020-12-24 21:20:46 +03:00
Kazu Hirata	df812115e3	[CodeGen, Transforms] Use llvm::any_of (NFC)	2020-12-24 09:08:36 -08:00
Roman Lebedev	c043f5055e	[SimplifyCFG] Teach FoldBranchToCommonDest() to preserve DomTree, part 1 ... for conditional branch case	2020-12-20 00:18:36 +03:00
Roman Lebedev	262ff9c23e	[SimplifyCFG] Teach TryToMergeLandingPad() to preserve DomTree	2020-12-20 00:18:36 +03:00
Roman Lebedev	6a1617d67c	[SimplifyCFG] Teach SimplifyCondBranchToTwoReturns() to preserve DomTree, part 2 ... for the custom case returning void.	2020-12-20 00:18:36 +03:00
Roman Lebedev	b94520c9ee	[SimplifyCFG] Teach SimplifyCondBranchToTwoReturns() to preserve DomTree, part 1 ... for the general case of returning a value.	2020-12-20 00:18:35 +03:00
Roman Lebedev	4d87a6ad13	[NFCI][SimplifyCFG] SimplifyCondBranchToTwoReturns(): pull out BI->getParent() into a variable	2020-12-20 00:18:35 +03:00
Roman Lebedev	83659c7076	[SimplifyCFG] simplifySingleResume(): FoldReturnIntoUncondBranch() already knows how to preserve DomTree ... so just ensure that we pass DomTreeUpdater into it. Apparently, there were no dedicated tests just for that functionality, so I'm adding one here.	2020-12-20 00:18:34 +03:00
Roman Lebedev	b7d00e29b7	[SimplifyCFG] Teach simplifySingleResume() to preserve DomTree	2020-12-20 00:18:34 +03:00
Roman Lebedev	c209b88dd4	[SimplifyCFG] Teach simplifyCommonResume() to preserve DomTree	2020-12-20 00:18:34 +03:00
Roman Lebedev	76e74d9395	[SimplifyCFG] Teach removeEmptyCleanup() to preserve DomTree	2020-12-20 00:18:33 +03:00
Roman Lebedev	4be8707e64	[SimplifyCFG] Teach FoldTwoEntryPHINode() to preserve DomTree Still boring, simply drop all edges to successors of DomBlock, and add an edge to BB instead.	2020-12-20 00:18:33 +03:00
Roman Lebedev	b43b77ff9b	[NFCI][SimplifyCFG] simplifyOnce(): also perform DomTree validation And that exposes that a number of tests don't actually manage to maintain DomTree validity, which is in line with my observations. Once again, the SimplifyCFG pass currently does not require/preserve DomTree by default, so this is effectively NFC.	2020-12-20 00:18:32 +03:00
Whitney Tsang	2a814cd9e1	Ensure SplitEdge to return the new block between the two given blocks This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’. Below is an example of splitting the block bb1 at its first instruction. /// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: Differential Revision: https://reviews.llvm.org/D92200	2020-12-18 17:37:17 +00:00
Yevgeny Rouban	f0e3d1d6ca	[IndVars] Fix adding trunc instructions to unwind blocks Truncate instruction must not be inserted before landing pads. The insertion point is fixed.	2020-12-18 12:52:23 +07:00
Kazu Hirata	b621116716	[Transforms] Use llvm::erase_if (NFC)	2020-12-17 19:53:10 -08:00
Rong Xu	31c0b8700b	Fix clang-ppc64le-rhel buildbot build error Fix buildbot build error due to commit 3733463d: [IR][PGO] Add hot func attribute and use hot/cold attribute in func section	2020-12-17 19:14:43 -08:00
Roman Lebedev	2d07414ee5	[SimplifyCFG] Teach simplifyUnreachable() to preserve DomTree Pretty boring, removeUnwindEdge() already knows how to update DomTree, so if we are to call it, we must first flush our own pending updates; otherwise, we just stop predecessors from branching to us, and for certain predecessors, stop their predecessors from branching to them also.	2020-12-18 00:37:22 +03:00
Roman Lebedev	2ee724863e	[SimplifyCFG] ConstantFoldTerminator() already knows how to preserve DomTree ... so just ensure that we pass DomTreeUpdater into it. Fixes DomTree preservation for a number of tests, all of which are marked as such so that they do not regress.	2020-12-18 00:37:22 +03:00
Roman Lebedev	164e0847a5	[SimplifyCFG] DeleteDeadBlock() already knows how to preserve DomTree ... so just ensure that we pass DomTreeUpdater into it. Fixes DomTree preservation for a large number of tests, all of which are marked as such so that they do not regress.	2020-12-18 00:37:21 +03:00
Bangtian Liu	511cfe9441	Revert "Ensure SplitEdge to return the new block between the two given blocks" This reverts commit `d20e0c3444`.	2020-12-17 21:00:37 +00:00
Nabeel Omer	df2b9a3e02	[DebugInfo] Avoid re-ordering assignments in LCSSA The LCSSA pass makes use of a function insertDebugValuesForPHIs() to propagate dbg.value() intrinsics to newly inserted PHI instructions. Faulty behaviour occurs when the parent PHI of a newly inserted PHI is not the most recent assignment to a source variable. insertDebugValuesForPHIs ends up propagating a value that isn't the most recent assignment. This change removes the call to insertDebugValuesForPHIs() from LCSSA, preventing incorrect dbg.value intrinsics from being propagated. Propagating variable locations between blocks will occur later, during LiveDebugValues. Differential Revision: https://reviews.llvm.org/D92576	2020-12-17 16:17:32 +00:00
Bangtian Liu	d20e0c3444	Ensure SplitEdge to return the new block between the two given blocks This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’. Below is an example of splitting the block bb1 at its first instruction. /// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: Differential Revision: https://reviews.llvm.org/D92200	2020-12-17 16:00:15 +00:00
Florian Hahn	75c04bfc61	[SimplifyCFG] Preserve !annotation in FoldBranchToCommonDest. When folding a branch to a common destination, preserve !annotation on the created instruction, if the terminator of the BB that is going to be removed has !annotation. This should ensure that !annotation is attached to the instructions that 'replace' the original terminator. Reviewed By: jdoerfert, lebedev.ri Differential Revision: https://reviews.llvm.org/D93410	2020-12-17 14:06:58 +00:00
dfukalov	9ed8e0caab	[NFC] Reduce include files dependency and AA header cleanup (part 2). Continuing work started in https://reviews.llvm.org/D92489: Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h". Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92852	2020-12-17 14:04:48 +03:00
Roman Lebedev	5cce4aff18	[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock() already knows how to preserve DomTree ... so just ensure that we pass DomTreeUpdater into it. Fixes DomTree preservation for a large number of tests, all of which are marked as such so that they do not regress.	2020-12-17 01:03:49 +03:00
Roman Lebedev	49dac4aca0	[SimplifyCFG] MergeBlockIntoPredecessor() already knows how to preserve DomTree ... so just ensure that we pass DomTreeUpdater into it. Fixes DomTree preservation for a large number of tests, all of which are marked as such so that they do not regress.	2020-12-17 01:03:49 +03:00
Bangtian Liu	c10757200d	Revert "Ensure SplitEdge to return the new block between the two given blocks" This reverts commit `cf638d793c`.	2020-12-16 11:52:30 +00:00
Bangtian Liu	cf638d793c	Ensure SplitEdge to return the new block between the two given blocks This PR implements the function splitBasicBlockBefore to address an issue that occurred during SplitEdge(BB, Succ, ...), inside splitBlockBefore. The issue occurs in SplitEdge when the Succ has a single predecessor and the edge between the BB and Succ is not critical. This produces the result ‘BB->Succ->New’. The new function splitBasicBlockBefore was added to splitBlockBefore to handle the issue and now produces the correct result ‘BB->New->Succ’. Below is an example of splitting the block bb1 at its first instruction. /// Original IR bb0: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlock bb0: br bb1 bb1: br bb1.split bb1.split: %0 = mul i32 1, 2 br bb2 bb2: /// IR after splitEdge(bb0, bb1) using splitBasicBlockBefore bb0: br bb1.split bb1.split: br bb1 bb1: %0 = mul i32 1, 2 br bb2 bb2: Differential Revision: https://reviews.llvm.org/D92200	2020-12-15 23:32:29 +00:00
Roman Lebedev	e113317958	[NFCI][SimplifyCFG] Add basic scaffolding for gradually making the pass DomTree-aware Two observations: 1. Unavailability of DomTree makes it impossible to make `FoldBranchToCommonDest()` transform in certain cases, where the successor is dominated by predecessor, because we then don't have PHI's, and can't recreate them, well, without handrolling 'is dominated by' check, which doesn't really look like a great solution to me. 2. Avoiding invalidating DomTree in SimplifyCFG will decrease the number of `Dominator Tree Construction` by 5 (from 28 now, i.e. -18%) in `-O3` old-pm pipeline (as per `llvm/test/Other/opt-O3-pipeline.ll`) This might or might not be beneficial for compile time. So the plan is to make SimplifyCFG preserve DomTree, and then eventually make DomTree fully required and preserved by the pass. Now, SimplifyCFG is ~7KLOC. I don't think it will be nice to do all this uplifting in a single mega-commit, nor would it be possible to review it in any meaningful way. But, i believe, it should be possible to do this in smaller steps, introducing the new behavior, in an optional way, off-by-default, opt-in option, and gradually fixing transforms one-by-one and adding the flag to appropriate test coverage. Then, eventually, the default should be flipped, and eventually^2 the flag removed. And that is what is happening here - when the new off-by-default option is specified, DomTree is required and is claimed to be preserved, and SimplifyCFG-internal assertions verify that the DomTree is still OK.	2020-12-16 00:38:00 +03:00
Nico Weber	a852ee199c	Reland "[MachineDebugify] Insert synthetic DBG_VALUE instructions" This reverts commit `841f9c937f`. The change landed many months ago; something else broke those tests.	2020-12-14 22:34:23 -05:00
Nico Weber	841f9c937f	Revert "[MachineDebugify] Insert synthetic DBG_VALUE instructions" This reverts commit `2a5675f11d`. The tests it adds fail: https://reviews.llvm.org/D78135#2453736	2020-12-14 22:14:48 -05:00
Gulfem Savrun Yeniceri	7c0e3a77bc	[clang][IR] Add support for leaf attribute This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275	2020-12-14 14:48:17 -08:00
Philip Reames	f5fe8493e5	[LAA] Relax restrictions on early exits in loop structure This is a preparation patch for supporting multiple exits in the loop vectorizer, by itself it should be mostly NFC. This patch moves the loop structure checks from LAA to their respective consumers (where duplicates don't already exist). Moving the checks does end up changing some of the optimization warnings and debug output slightly, but nothing that appears to be a regression. Why do this? Well, after auditing the code, I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times. This patch simply makes this explicit so that if one consumer - say LV in the near future (hopefully) - wants to handle a broader class of loops, it can do so. Differential Revision: https://reviews.llvm.org/D92066	2020-12-14 12:44:01 -08:00
Roman Lebedev	59560e8589	[SimplifyCFG] FoldBranchToCommonDest(): temporarily put back restrictions on liveout uses of bonus instructions (PR48450) Even though `d38205144f` was mostly a correct fix for the external non-PHI users, it's not a generally correct fix, because the 'placeholder' values in those trivial PHI's we create shouldn't be always 'undef', but the PHI itself for the backedges, else we end up with wrong value, as the `@pr48450_2` test shows. But we can't just do that, because we can't check that the PHI can be its own incoming value when coming from certain predecessor, because we don't have a dominator tree. So until we can address this correctness problem properly, ensure that we don't perform the transformation if there are such problematic external uses. Making dominator tree available there is going to be involved, since `-simplifycfg` pass currently does not preserve/update domtree...	2020-12-14 20:14:31 +03:00
Roman Lebedev	e8360a8e1e	[NFC][SimplifyCFG] FoldBranchToCommonDest(): pull out 'common successor' into a variable Makes it easier to use it elsewhere	2020-12-14 20:14:31 +03:00
Kazu Hirata	5891ad4e22	[Transforms] Use llvm::erase_value (NFC)	2020-12-13 09:48:47 -08:00
Roman Lebedev	d38205144f	[SimplifyCFG] FoldBranchToCommonDest(): bonus instrns must only be used by PHI nodes in successors (PR48450) In particular, if the successor block, which is about to get a new predecessor block, currently only has a single predecessor, then the bonus instructions will be directly used within said successor, which is fine, since the block with bonus instructions dominates that successor. But once there's a new predecessor, the IR is no longer valid, and we don't fix it, because we only update PHI nodes. Which means, the live-out bonus instructions must be exclusively used by the PHI nodes in successor blocks. So we have to form trivial PHI nodes, which will then be successfully updated to receive cloned bonus instns. This all works fine, except for the fact that we don't have access to the dominator tree, and we don't ignore unreachable code, so we sometimes do end up having to deal with some weird IR. Fixes https://bugs.llvm.org/show_bug.cgi?id=48450	2020-12-13 00:06:57 +03:00
Fangrui Song	b5ad32ef5c	Migrate deprecated DebugLoc::get to DILocation::get This migrates all LLVM (except Kaleidoscope and CodeGen/StackProtector.cpp) DebugLoc::get to DILocation::get. The CodeGen/StackProtector.cpp usage may have a nullptr Scope and can trigger an assertion failure, so I don't migrate it. Reviewed By: #debug-info, dblaikie Differential Revision: https://reviews.llvm.org/D93087	2020-12-11 12:45:22 -08:00
Philip Reames	5171b7b40e	[indvars] Common a bit of code [NFC]	2020-12-08 15:25:48 -08:00
Valentin Churavy	700cf7dcc9	[VNCoercion] Disallow coercion between different ni addrspaces I'm not sure if it would be legal by the IR reference to introduce an addrspacecast here, since the IR reference is a bit vague on the exact semantics, but at least for our usage of it (and I suspect for many others' usage) it is not. For us, addrspacecasts between non-integral address spaces carry frontend information that the optimizer cannot deduce afterwards in a generic way (though we have frontend specific passes in our pipeline that do propagate these). In any case, I'm sure nobody is using it this way at the moment, since it would have introduced inttoptrs, which are definitely illegal. Fixes PR38375 Co-authored-by: Keno Fischer <keno@alumni.harvard.edu> Reviewed By: reames Differential Revision: https://reviews.llvm.org/D50010	2020-12-07 20:19:48 -05:00
Fangrui Song	2832f3528c	[Transforms] Delete unused declarations from NewGVN/CoroSplit/ValueMapper	2020-12-06 13:04:01 -08:00
Hiroshi Yamauchi	f9c3954a6e	Fix for Bug 48055. Differential Revision: https://reviews.llvm.org/D92599	2020-12-04 11:05:01 -08:00
Max Kazantsev	12b6c5e682	Return "[IndVars] ICmpInst should not prevent IV widening" This reverts commit `4bd35cdc3a`. The patch was reverted during the investigation. The investigation showed that the patch did not cause any trouble, but just exposed the existing problem that is addressed by the previous patch "[IndVars] Quick fix LHS/RHS bug". Returning without changes.	2020-12-04 12:34:43 +07:00
Max Kazantsev	3df0daceb2	[IndVars] Quick fix LHS/RHS bug The code relies on the fact that LHS is the NarrowDef but never really checks it. Adding the conservative restrictive check, will follow-up with handling of the case where RHS is a NarrowDef.	2020-12-04 12:34:42 +07:00
dfukalov	2ce38b3f03	[NFC] Reduce include files dependency. 1. Removed #include "...AliasAnalysis.h" in other headers and modules. 2. Cleaned up includes in AliasAnalysis.h. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92489	2020-12-03 18:25:05 +03:00
Max Kazantsev	4bd35cdc3a	Revert "[IndVars] ICmpInst should not prevent IV widening" This reverts commit `0c9c6ddf17`. We are seeing some failures with this patch locally. Not clear if it's causing them or just triggering a problem in another place. Reverting while investigating.	2020-12-03 18:01:41 +07:00
David Sherwood	71bd59f0cb	[SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute In this patch I have added support for a new loop hint called vectorize.scalable.enable that says whether we should enable scalable vectorization or not. If a user wants to instruct the compiler to vectorize a loop with scalable vectors they can now do this as follows: br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2 ... !2 = !{!2, !3, !4} !3 = !{!"llvm.loop.vectorize.width", i32 8} !4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true} Setting the hint to false simply reverts the behaviour back to the default, using fixed width vectors. Differential Revision: https://reviews.llvm.org/D88962	2020-12-02 13:23:43 +00:00
Roman Lebedev	15f8060f6f	[SimplifyCFG] FoldBranchToCommonDest: don't require that cmp of br is last instruction There is no correctness need for that, and since we allow live-out uses, this could theoretically happen, because currently nothing will move the cond to right before the branch in those tests. But regardless, lifting that restriction even makes the transform easier to understand. This makes the transform happen in 81 more cases (+0.55%)	2020-12-01 15:13:06 +03:00
Roman Lebedev	b0e9b7c59f	[NFC][SimplifyCFG] Add STATISTIC() to the FoldValueComparisonIntoPredecessors() fold	2020-11-30 12:27:16 +03:00
Max Kazantsev	0c9c6ddf17	[IndVars] ICmpInst should not prevent IV widening If we decided to widen IV with zext, then unsigned comparisons should not prevent widening (same for sext/sign comparisons). The result of comparison in wider type does not change in this case. Differential Revision: https://reviews.llvm.org/D92207 Reviewed By: nikic	2020-11-30 10:51:31 +07:00
Roman Lebedev	b33fbbaa34	Reland [SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions This was originally committed in `2245fb8aaa`, but was immediately reverted in `f3abd54958` because of a PHI handling issue. Original commit message: 1. It doesn't make sense to enforce that the bonus instruction is only used once in its basic block. What matters is whether those user instructions fit within our budget, sure, but that is another question. 2. It doesn't make sense to enforce that said bonus instructions are only used within their basic block. Perhaps the branch condition isn't using the value computed by said bonus instruction, and said bonus instruction is simply being calculated to be used in successors? So iff we can clone bonus instructions, to lift these restrictions, we just need to carefully update their external uses to use the new cloned instructions. Notably, this transform (even without this change) appears to be poison-unsafe as per alive2, but is otherwise (including the patch) legal. We don't introduce any new PHI nodes, but only "move" the instructions around, i'm not really seeing much potential for extra cost modelling for the transform, especially since now we allow at most one such bonus instruction by default. This causes the fold to fire +11.4% more (13216 -> 14725) as of vanilla llvm test-suite + RawSpeed. The motivational pattern is IEEE-754-2008 Binary16->Binary32 extension code: `ca57d77fb2/src/librawspeed/common/FloatingPoint.h (L115-L120)` ^ that should be a switch, but it is not now: https://godbolt.org/z/bvja5v That being said, even though this seemed like this would fix it: https://godbolt.org/z/xGq3TM apparently that fold is happening somewhere else after all, so something else also has a similar 'artificial' restriction.	2020-11-27 12:47:15 +03:00
Max Kazantsev	faf183874c	[IndVars] LCSSA Phi users should not prevent widening When widening an IndVar that has LCSSA Phi users outside the loop, we can safely widen it as usual and then truncate the result outside the loop without hurting the performance. Differential Revision: https://reviews.llvm.org/D91593 Reviewed By: skatkov	2020-11-27 11:19:54 +07:00
Roman Lebedev	f3abd54958	Revert "[SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions" Many bots are unhappy, at the very least missed a few codegen tests, and possibly this has a logic hole inducing a miscompile (will be really awesome to have ready reproducer..) Need to investigate. This reverts commit `2245fb8aaa`.	2020-11-26 23:13:43 +03:00
Roman Lebedev	2245fb8aaa	[SimplifyCFG] FoldBranchToCommonDest: lift use-restriction on bonus instructions 1. It doesn't make sense to enforce that the bonus instruction is only used once in its basic block. What matters is whether those user instructions fit within our budget, sure, but that is another question. 2. It doesn't make sense to enforce that said bonus instructions are only used within their basic block. Perhaps the branch condition isn't using the value computed by said bonus instruction, and said bonus instruction is simply being calculated to be used in successors? So iff we can clone bonus instructions, to lift these restrictions, we just need to carefully update their external uses to use the new cloned instructions. Notably, this transform (even without this change) appears to be poison-unsafe as per alive2, but is otherwise (including the patch) legal. We don't introduce any new PHI nodes, but only "move" the instructions around, i'm not really seeing much potential for extra cost modelling for the transform, especially since now we allow at most one such bonus instruction by default. This causes the fold to fire +11.4% more (13216 -> 14725) as of vanilla llvm test-suite + RawSpeed. The motivational pattern is IEEE-754-2008 Binary16->Binary32 extension code: `ca57d77fb2/src/librawspeed/common/FloatingPoint.h (L115-L120)` ^ that should be a switch, but it is not now: https://godbolt.org/z/bvja5v That being said, even though this seemed like this would fix it: https://godbolt.org/z/xGq3TM apparently that fold is happening somewhere else after all, so something else also has a similar 'artificial' restriction.	2020-11-26 22:51:22 +03:00
Roman Lebedev	65db7d38e0	[NFC][SimplifyCFG] Add statistic to `FoldBranchToCommonDest()` fold	2020-11-26 22:51:21 +03:00
Zhengyang Liu	345fcccb33	Fix use-of-uninitialized-value in rG75f50e15bf8f Differential Revision: https://reviews.llvm.org/D71126	2020-11-26 01:39:22 -07:00
Max Kazantsev	28d7ba1543	[IndVars] Use more precise context when eliminating narrowing When deciding to widen narrow use, we may need to prove some facts about it. For proof, the context is used. Currently we take the instruction being widened as the context. However, we may be more precise here if we take as context the point that dominates all users of instruction being widened. Differential Revision: https://reviews.llvm.org/D90456 Reviewed By: skatkov	2020-11-25 11:47:39 +07:00
Philip Reames	10ddb927c1	[SCEV] Use isa<> pattern for testing for CouldNotCompute [NFC] Some older code - and code copied from older code - still directly tested against the singleton result of SE::getCouldNotCompute. Using the isa<SCEVCouldNotCompute> form is both shorter and more readable.	2020-11-24 18:47:49 -08:00
Kazu Hirata	df73b8c174	[ValueMapper] Remove unused declaration remapFunction (NFC) The function declaration with two parameters was introduced on Apr 16 2016 in commit `f0d73f95c1` without a corresponding definition.	2020-11-22 21:52:03 -08:00
Hongtao Yu	f3c445697d	[CSSPGO] IR intrinsic for pseudo-probe block instrumentation This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues: 1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality. 2. The counter atomics may not be fully cleaned up from the code stream eventually. 3. Extra work is needed for re-targeting. We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. 
The core flag given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but it does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality. Let's now look at an example. Given the following LLVM IR: ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 br i1 %cmp, label %bb1, label %bb2 bb1: br label %bb3 bb2: br label %bb3 bb3: ret void } ``` The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID. ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86490	2020-11-20 10:39:24 -08:00
Max Kazantsev	515105f46b	[NFC] Remove comment (committed ahead of time by mistake)	2020-11-19 16:28:34 +07:00
Max Kazantsev	7c601d09a7	[NFC] Move code earlier as preparation for further changes	2020-11-19 16:27:23 +07:00
Kazu Hirata	43c0e4f665	[Transforms] Use llvm::is_contained (NFC)	2020-11-18 20:42:22 -08:00
Nikita Popov	f4a3969bff	[Inline] Fix incorrectly dropped noalias metadata This is the same fix as `23aeadb89d`, just for CloneScopedAliasMetadata rather than PropagateCallSiteMetadata. In this case the previous outcome was incorrectly dropped metadata, as it was not part of the computed metadata map. The real change in the test is that the first load now retains metadata, the rest of the changes are due to changes in metadata numbering.	2020-11-18 21:22:50 +01:00
Nikita Popov	23aeadb89d	[Inline] Fix incorrect noalias metadata application (PR48209) The VMap also contains a mapping from Argument => Instruction, where the instruction is part of the original function, not the inlined one. The code was assuming that all the instructions in the VMap were inlined. This was a pre-existing problem for the loop access metadata, but was extended to the more common noalias metadata by `27f647d117`, thus causing miscompiles. There is a similar assumption inside CloneAliasScopeMetadata(), so that one likely needs to be fixed as well.	2020-11-18 20:52:58 +01:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Matt Arsenault	c5ce6036c1	Linker: Fix linking of byref types This wasn't properly remapping the type like with the other attributes, so this would end up hitting a verifier error after linking different modules using byref.	2020-11-17 11:02:04 -05:00
Max Kazantsev	63dd1734b2	[NFC] Collect ext users into vector instead of finding them twice	2020-11-17 14:01:43 +07:00
Kazu Hirata	1da60f1d44	[Transforms] Use pred_empty (NFC)	2020-11-16 22:09:14 -08:00
Arthur Eubanks	7de6dcd246	[Debugify] Skip debugifying on special/immutable passes With a function pass manager, it would insert debuginfo metadata before getting to function passes while processing the pass manager, causing debugify to skip while running the function passes. Skip special passes + verifier + printing passes. Compared to the legacy implementation of -debugify-each, this additionally skips verifier passes. Probably no need to update the legacy version since it will be obsolete soon. This fixes 2 instcombine tests using -debugify-each under NPM. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D91558	2020-11-16 20:39:46 -08:00
Roman Lebedev	6861d938e5	Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485 the implementation is known-broken for certain inputs, the bugreport was up for a significant amount of time, and there has been no activity to address it. Therefore, just completely rip out all of misexpect handling. I suspect, fixing it requires redesigning the internals of MD_misexpect. Should anyone commit to fixing the implementation problem, starting from clean slate may be better anyways. This reverts commit `7bdad08429`, and some of its follow-ups, that don't stand on their own.	2020-11-14 13:12:38 +03:00
serge-sans-paille	9218ff50f9	llvmbuildectomy - replace llvm-build by plain cmake No longer rely on an external tool to build the llvm component layout. Instead, leverage the existing `add_llvm_componentlibrary` cmake function and introduce `add_llvm_component_group` to accurately describe component behavior. These functions store extra properties in the created targets. These properties are processed once all components are defined to resolve library dependencies and produce the header expected by llvm-config. Differential Revision: https://reviews.llvm.org/D90848	2020-11-13 10:35:24 +01:00
Max Kazantsev	0a1d394bf3	[NFC] Refactor loop-invariant getters to return Optional	2020-11-13 15:03:10 +07:00
Max Kazantsev	d6dd938589	[IndVars] IV user should not prevent use widening Sometimes an instruction we are trying to widen is used by the IV (which means the instruction is the IV increment). Currently this may prevent its widening. We should ignore such a user because it will be dead once the transform is done anyways. Differential Revision: https://reviews.llvm.org/D90920 Reviewed By: fhahn	2020-11-12 12:02:01 +07:00
Max Kazantsev	2e01ceafaa	[IndVars] Recognize 'sub nuw' expressed as 'add' for widening InstCombine canonicalizes 'sub nuw' instructions to 'add' without the `nuw` flag. The typical case where we see it is decrementing induction variables. For them, IndVars fails to prove that it's legal to widen them, and inserts unprofitable `zext`'s. This patch adds recognition of such pattern using SCEV. Differential Revision: https://reviews.llvm.org/D89550 Reviewed By: fhahn, skatkov	2020-11-12 10:51:29 +07:00
Jonas Paulsson	89a1042b6a	Make inferLibFuncAttributes() add SExt attribute on second arg to ldexp. This was missing as discovered by the SystemZ multistage bot: http://lab.llvm.org:8011/#/builders/8, where wrong code resulted when this extension was not performed. Thanks for review by Ulrich Weigand and Roman Lebedev. Differential Revision: https://reviews.llvm.org/D90760	2020-11-10 18:32:15 +01:00
David Green	c7e275388e	[ARM] Don't aggressively unroll vector remainder loops We already do not unroll loops with vector instructions under MVE, but that does not include the remainder loops that the vectorizer produces. These remainder loops will be rarely executed and are not worth unrolling, as the trip count is likely to be low if they get executed at all. Luckily they get llvm.loop.isvectorized to make recognizing them simpler. We have wanted to do this for a while but hit issues with low overhead loops being reverted due to difficult register allocation. With recent changes that seems to be less of an issue now. Differential Revision: https://reviews.llvm.org/D90055	2020-11-10 17:01:31 +00:00
Tim Northover	f7fe7ea24d	[MergeFunctions] fix function attribute comparison in FunctionComparator The comparison of AttributeSets stopped after seeing a matching type attribute. Subsequent mismatching attributes were not detected causing a crash.	2020-11-09 09:19:11 +00:00
Kazu Hirata	75e46c6328	[Mem2Reg] Use llvm::count instead of std::count (NFC)	2020-11-07 20:18:47 -08:00
Atmn Patel	04a0896487	Revert "[LoopDeletion] Allows deletion of possibly infinite side-effect free loops" This reverts commit `0b17c6e447`. This patch causes a compile-time error in SCEV.	2020-11-07 00:32:12 -05:00
Atmn Patel	0b17c6e447	[LoopDeletion] Allows deletion of possibly infinite side-effect free loops From C11 and C++11 onwards, a forward-progress requirement has been introduced for both languages. In the case of C, loops with non-constant conditionals that do not have any observable side-effects (as defined by 6.8.5p6) can be assumed by the implementation to terminate, and in the case of C++, this assumption extends to all functions. The clang frontend will emit the `mustprogress` function attribute for C++ functions (D86233, D85393, D86841) and emit the loop metadata `llvm.loop.mustprogress` for every loop in C11 or later that has a non-constant conditional. This patch modifies LoopDeletion so that only loops with the `llvm.loop.mustprogress` metadata or loops contained in functions that are required to make progress (`mustprogress` or `willreturn`) are checked for observable side-effects. If these loops do not have an observable side-effect, then we delete them. Loops without observable side-effects that do not satisfy the above conditions will not be deleted. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86844	2020-11-06 22:06:58 -05:00
Atmn Patel	babc224c5d	[LoopDeletion] Remove dead loops with no exit blocks Currently, LoopDeletion refuses to remove dead loops with no exit blocks because it cannot statically determine the control flow after it removes the block. This leads to miscompiles if the loop is an infinite loop and should've been removed. Differential Revision: https://reviews.llvm.org/D90115	2020-11-06 17:08:34 -05:00
Giorgis Georgakoudis	700d2417d8	[CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers CodeExtractor handles bitcasts in the extracted region that have lifetime markers users in the outer region as outputs. That creates unnecessary alloca/reload instructions and extra lifetime markers. The patch identifies those cases, and replaces uses in out-of-region lifetime markers with new bitcasts in the outer region. Example ``` define void @foo() { entry: %0 = alloca i32 br label %extract extract: %1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 4, i8* %1) call void @use(i32* %0) br label %exit exit: call void @use(i32* %0) call void @llvm.lifetime.end.p0i8(i64 4, i8* %1) ret void } ``` Current extraction ``` define void @foo() { entry: %.loc = alloca i8, align 8 %0 = alloca i32, align 4 br label %codeRepl codeRepl: ; preds = %entry %lt.cast = bitcast i8* %.loc to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast) %lt.cast1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1) call void @foo.extract(i32* %0, i8** %.loc) %.reload = load i8, i8* %.loc, align 8 call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast) br label %exit exit: ; preds = %codeRepl call void @use(i32* %0) call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload) ret void } define internal void @foo.extract(i32* %0, i8** %.out) { newFuncRoot: br label %extract exit.exitStub: ; preds = %extract ret void extract: ; preds = %newFuncRoot %1 = bitcast i32* %0 to i8* store i8* %1, i8** %.out, align 8 call void @use(i32* %0) br label %exit.exitStub } ``` Extraction with patch ``` define void @foo() { entry: %0 = alloca i32, align 4 br label %codeRepl codeRepl: ; preds = %entry %lt.cast1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1) call void @foo.extract(i32* %0) br label %exit exit: ; preds = %codeRepl call void @use(i32* %0) %lt.cast = bitcast i32* %0 to i8* call void 
@llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast) ret void } define internal void @foo.extract(i32* %0) { newFuncRoot: br label %extract exit.exitStub: ; preds = %extract ret void extract: ; preds = %newFuncRoot %1 = bitcast i32* %0 to i8* call void @use(i32* %0) br label %exit.exitStub } ``` Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D90689	2020-11-05 17:01:08 -08:00
Sjoerd Meijer	7eb70158e4	[IndVarSimplify][SimplifyIndVar] Move WidenIV to Utils/SimplifyIndVar. NFCI. This moves WidenIV from IndVarSimplify to Utils/SimplifyIndVar so that we have createWideIV available as a generic helper utility. I.e., this is not only useful in IndVarSimplify, but could be useful for loop transformations. For example, motivation for this refactoring is the loop flatten transformation: if induction variables in a loop nest can be widened, we can avoid having to perform certain overflow checks, enabling this transformation. Differential Revision: https://reviews.llvm.org/D90421	2020-11-05 16:52:47 +00:00
Xun Li	7f34aca083	[musttail] Unify musttail call preceding return checking There is already an API in BasicBlock that checks and returns the musttail call if it precedes the return instruction. Use it instead of manually checking in each place. Differential Revision: https://reviews.llvm.org/D90693	2020-11-03 11:39:27 -08:00
Jameson Nash	59a6ab28c4	[GVN] small improvements to comments	2020-11-03 13:21:48 -05:00
Fangrui Song	98b9338588	[Debugify] Port -debugify-each to NewPM Preemptively switch 2 tests to the new PM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D90365	2020-11-02 08:16:43 -08:00
Florian Hahn	b3b993a7ad	Reland "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts the revert commit `408c4408fa`. This version of the patch includes a fix for a crash caused by treating ICmp/FCmp constant expressions as instructions. Original message: On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV.	2020-11-02 15:39:29 +00:00
Nikita Popov	27f647d117	[Inliner] Consistently apply callsite noalias metadata Previously, !noalias and !alias.scope metadata on the call site was applied as part of CloneAliasScopeMetadata(), which short-circuits if the callee does not use any noalias metadata itself. However, these two things have no relation to each other. Consistently apply !noalias and !alias.scope metadata by integrating this into an existing function that handled !llvm.access.group and !llvm.mem.parallel_loop_access metadata. The handling for all of these metadata kinds is essentially the same.	2020-10-31 10:54:45 +01:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Florian Hahn	408c4408fa	Revert "[TTI] Add VecPred argument to getCmpSelInstrCost." This reverts commit `73f01e3df5`. This appears to break http://lab.llvm.org:8011/#/builders/85/builds/383.	2020-10-30 21:26:14 +00:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
Pedro Tammela	70a495c7f0	[NFC][LoopSimplify] modernize for loops over LoopInfo This patch modifies two for loops to use the range based syntax. Since they are equivalent, this patch is tagged NFC. Differential Revision: https://reviews.llvm.org/D90069	2020-10-30 16:50:07 +00:00
Simon Pilgrim	ed577892cf	Use cast<> instead of dyn_cast<> as we dereference the pointers immediately. NFCI. Fix clang static analyzer warnings - we're better off relying on cast<> asserting on failure rather than a null dereference crash.	2020-10-30 15:20:40 +00:00
Simon Pilgrim	b7c91a9b8e	[SCEV] SCEVExpander::InsertNoopCastOfTo - reduce scope of pointer type. NFCI. By reducing the scope of the dyn_cast<PointerType> we can make this a cast<PointerType> and avoid clang static analyzer null dereference warnings.	2020-10-30 14:55:09 +00:00
Florian Hahn	73f01e3df5	[TTI] Add VecPred argument to getCmpSelInstrCost. On some targets, like AArch64, vector selects can be efficiently lowered if the vector condition is a compare with a supported predicate. This patch adds a new argument to getCmpSelInstrCost, to indicate the predicate of the feeding select condition. Note that it is not sufficient to use the context instruction when querying the cost of a vector select starting from a scalar one, because the condition of the vector select could be composed of compares with different predicates. This change greatly improves modeling the costs of certain compare/select patterns on AArch64. I am also planning on putting up patches to make use of the new argument in SLPVectorizer & LV. Reviewed By: dmgreen, RKSimon Differential Revision: https://reviews.llvm.org/D90070	2020-10-30 13:49:08 +00:00
Roman Lebedev	81fc53a36a	[SCEV] Introduce SCEVPtrToIntExpr (PR46786) And use it to model LLVM IR's `ptrtoint` cast. This is essentially an alternative to D88806, but with no chance for all the problems it caused due to having the cast as implicit there. (see rG7ee6c402474a2f5fd21c403e7529f97f6362fdb3) As we've established by now, there are at least two reasons why we want this: * It will allow SCEV to actually model the `ptrtoint` casts and their operands, instead of treating them as `SCEVUnknown` * It should help with initial problem of PR46786 - this should eventually allow us to not lose pointer-ness of an expression in more cases As discussed in [[ https://bugs.llvm.org/show_bug.cgi?id=46786 \| PR46786 ]], in principle, we could just extend `SCEVUnknown` with a `is ptrtoint` cast, because `ScalarEvolution::getPtrToIntExpr()` should sink the cast as far down into the expression as possible, so in the end we should always end up with `SCEVPtrToIntExpr` of `SCEVUnknown`. But i think that it isn't the best solution, because it doesn't really matter from memory consumption side - there probably won't be that many `SCEVPtrToIntExpr`s for it to matter, and it allows for much better discoverability. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89456	2020-10-30 11:13:35 +03:00
Stefanos Baziotis	a3345300b6	[LCSSA] Doc for special treatment of PHIs Differential Revision: https://reviews.llvm.org/D89739	2020-10-29 22:50:07 +02:00
Nikita Popov	20b386aae0	[LoopUtils] Fix neutral value for vector.reduce.fadd Use -0.0 instead of 0.0 as the start value. The previous use of 0.0 was fine for all existing uses of this function though, as it is always generated with fast flags right now, and thus nsz.	2020-10-29 21:45:13 +01:00
Dávid Bolvanský	7a2abf5aca	[InferAttrs] Add nocapture/writeonly to string/mem libcalls One step closer to fix PR47644. Differential Revision: https://reviews.llvm.org/D89645	2020-10-29 20:06:43 +01:00
Max Kazantsev	a5b2e795c3	[NFC][SCEV] Refactor monotonic predicate checks to return enums instead of bools This patch gets rid of output parameter which is not needed for most users and prepares this API for further refactoring.	2020-10-29 16:01:25 +07:00
Fangrui Song	39856d5d0b	[Debugify] Move global namespace functions into llvm:: Also move exportDebugifyStats from tools/opt to Debugify.cpp	2020-10-28 19:11:41 -07:00
Vedant Kumar	5a3ef55a52	[Utils] Skip RemoveRedundantDbgInstrs in MergeBlockIntoPredecessor (PR47746) This patch changes MergeBlockIntoPredecessor to skip the call to RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to some compile-time issues spotted in LoopUnroll and SimplifyCFG. The call to RemoveRedundantDbgInstrs appears to have changed the worst-case behavior of the merging utility. Loosely speaking, it seems to have gone from O(#phis) to O(#insts). It might not be possible to mitigate this by scanning a block to determine whether there are any debug intrinsics to remove, since such a scan costs O(#insts). So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly little fallout from this, and most of it can be addressed by doing RemoveRedundantDbgInstrs later. The exception is (the block-local version of) SimplifyCFG, where it might just be too expensive to call RemoveRedundantDbgInstrs. Differential Revision: https://reviews.llvm.org/D88928	2020-10-27 10:12:59 -07:00
Simon Pilgrim	bce770ffa6	Revert rG0905bd5c2fa42bd4c "[InstCombine] collectBitParts - add trunc support." This reverts commit `0905bd5c2f`. Causing failures in multistage buildbots that I need to investigate	2020-10-27 13:43:54 +00:00
Nico Weber	2a4e704c92	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `e5766f25c6`. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Simon Pilgrim	0905bd5c2f	[InstCombine] collectBitParts - add trunc support. This should allow us to remove the rather limited matchOrConcat fold and just use recognizeBSwapOrBitReverseIdiom.	2020-10-27 13:14:54 +00:00
Arthur Eubanks	e5766f25c6	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Sriraman Tallam	ad1b9daa4b	Prepend "__uniq" to symbol names hash with -funique-internal-linkage-names. Prepend the module name hash with a fixed string ".__uniq." which helps tools that consume sampled profiles and attribute it to functions to understand that this symbol belongs to a unique internal linkage type symbol. Symbols with suffixes can result from various optimizations in the compiler. Function Multiversioning, function splitting, parameter constant propagation, unique internal linkage names. External tools like sampled profile aggregators combine profiles from multiple runs of a binary. They use various heuristics with symbols that have suffixes to try and attribute the profile to the right function instance. For instance multi-versioned symbols like foo.avx, foo.sse4.2, etc even though different should be attributed to the same source function if a single function is versioned, using attribute target_clones (supported in GCC but yet to land in LLVM). Similarly, functions that are split (split part having a .cold suffix) could have profiles for both the original and split symbols but would be aggregated and attributed to the original function that was split. Unique internal linkage functions however have different source instances and the aggregator must not put them together but attribute it to the appropriate function instance. To be sure that we are dealing with a symbol of a unique internal linkage function, we would like to prepend the hash with a known string ".__uniq." which these tools can check to understand the suffix type. Differential Revision: https://reviews.llvm.org/D89617	2020-10-26 14:24:28 -07:00
Simon Pilgrim	532f3bec3e	[InstCombine] collectBitParts - add bitreverse intrinsic support.	2020-10-26 14:36:36 +00:00
TaWeiTu	060a4fccf1	[LoopVersioning] Form dedicated exits for versioned loop to preserve simplify form The exit blocks of the versioned and non-versioned loops are not dedicated and thus the two loops are not in simplify form. Insert dummy exit blocks after loop versioning with `formDedicatedExits()` to preserve the simplify form for subsequent passes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89569	2020-10-24 21:40:46 +08:00
Artur Pilipenko	6ec2c5e402	GC-parseable element atomic memcpy/memmove This change introduces a GC parseable lowering for element atomic memcpy/memmove intrinsics. This way runtime can provide an implementation which can take a safepoint during copy operation. See "GC-parseable element atomic memcpy/memmove" thread on llvm-dev for the background and details: https://groups.google.com/g/llvm-dev/c/NnENHzmX-b8/m/3PyN8Y2pCAAJ Differential Revision: https://reviews.llvm.org/D88861	2020-10-23 14:06:09 -07:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. Code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) commonly wants to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
OCHyams	fea067bdfd	[mem2reg] Remove dbg.values describing contents of dead allocas This patch copies @vsk's fix to instcombine from D85555 over to mem2reg. The motivation and rationale are exactly the same: When mem2reg removes an alloca, it erases the dbg.{addr,declare} instructions which refer to the alloca. It would be better to instead remove all debug intrinsics which describe the contents of the dead alloca, namely all dbg.value(<dead alloca>, ..., DW_OP_deref)'s. As far as I can tell, prior to D80264 these `dbg.value+deref`s would have been silently dropped instead of being made `undef`, so we're just returning to previous behaviour with these patches. Testing: `llvm-lit llvm/test` and `ninja check-clang` gave no unexpected failures. Added 3 tests, each of which covers a dbg.value deletion path in mem2reg: mem2reg-promote-alloca-1.ll mem2reg-promote-alloca-2.ll mem2reg-promote-alloca-3.ll The first is based on the dexter test inlining.c from D89543. This patch also improves the debugging experience for loop.c from D89543, which suffers similarly after arg promotion instead of inlining.	2020-10-23 04:46:56 +00:00
Caroline Concatto	2415636475	[SVE]Clarify TypeSize comparisons in llvm/lib/Transforms Use isKnownXY comparators when one of the operands can be with scalable vectors or getFixedSize() for all the other cases. This patch also does bug fixes for getPrimitiveSizeInBits by using getFixedSize() near the places with the TypeSize comparison. Differential Revision: https://reviews.llvm.org/D89703	2020-10-23 09:15:17 +01:00
Vedant Kumar	099bffe7f7	Revert "[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI)" This reverts commit `26ee8aff2b`. It's necessary to bitcast the pointer operand of a lifetime marker if it has an opaque pointer type. rdar://70560161	2020-10-22 12:25:50 -07:00
Arthur Eubanks	92d9a3868a	Port -instnamer to NPM Some clang tests use this. Reviewed By: akhuang Differential Revision: https://reviews.llvm.org/D89931	2020-10-22 12:08:36 -07:00
Zequan Wu	2f29341114	Revert "Revert "SimplifyCFG: Clean up optforfuzzing implementation"" This reverts commit `716f7636e1`.	2020-10-21 17:08:56 -07:00
Zequan Wu	716f7636e1	Revert "SimplifyCFG: Clean up optforfuzzing implementation" See discussion: https://reviews.llvm.org/D89590 This reverts commit `cdd006eec9`.	2020-10-21 16:56:32 -07:00
Geoffrey Martin-Noble	c17ae2916c	Remove unnecessary header include which violates layering This was introduced in https://reviews.llvm.org/D89774, but I don't think it should be necessary. Reviewed By: TaWeiTu, aeubanks Differential Revision: https://reviews.llvm.org/D89843	2020-10-20 20:14:03 -07:00
Ta-Wei Tu	529ecd19df	[NPM] port -unify-loop-exits to NPM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89774	2020-10-20 10:46:57 -07:00
Ta-Wei Tu	59286b36df	[NPM] Port -mergereturn to NPM Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D89781	2020-10-20 10:33:58 -07:00
Atmn Patel	595c615606	[IR] Adds mustprogress as a LLVM IR attribute This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85393	2020-10-20 03:09:57 -04:00
Jordan Rupprecht	8a377f1e3c	[NFC] Inline assertion-only variable	2020-10-19 15:11:37 -07:00
Roman Lebedev	e0567582b8	[NFCI][SCEV] Always refer to enum SCEVTypes as enum, not integer The main tricky thing here is forward-declaring the enum: we have to specify its underlying data type. In particular, this avoids the danger of switching over the SCEVTypes, but actually switching over an integer, and not being notified when some case is not handled. I have updated most of such switches to be exhaustive and not have a default case, where it's pretty obvious to be the intent, however not all of them.	2020-10-20 00:10:22 +03:00
Roman Lebedev	3355284b2d	[NFC][SCEVExpander] isHighCostExpansionHelper(): rewrite as a switch If we switch over an enum, compiler can easily issue a diagnostic if some case is not handled. However with an if cascade that isn't so. Experimental evidence suggests new behavior to be superior.	2020-10-20 00:10:22 +03:00
Roman Lebedev	d083d55c2c	[NFC][SCEV] Rename SCEVCastExpr into SCEVIntegralCastExpr All existing SCEV cast types operate on integers. D89456 will add SCEVPtrToIntExpr cast expression type. I believe this is best for consistency. Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D89455	2020-10-19 10:59:53 +03:00
Dávid Bolvanský	65e94cc946	[InferAttrs] Add argmemonly attribute to string libcalls Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89602	2020-10-18 01:33:26 +02:00
Dávid Bolvanský	2a75e956e5	Revert "[InferAttrs] Add argmemonly attribute to string libcalls" This reverts commit `b77dd32a6f`. Sanitizer tests are broken.	2020-10-17 23:29:02 +02:00
Dávid Bolvanský	b77dd32a6f	[InferAttrs] Add argmemonly attribute to string libcalls Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89602	2020-10-17 22:42:36 +02:00
Matt Arsenault	0a7cd99a70	Reapply "OpaquePtr: Add type to sret attribute" This reverts commit `eb9f7c28e5`. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.	2020-10-16 11:05:02 -04:00
Michael Liao	98f254960f	[globalopt] Teach to look through `addrspacecast`. - so that global variables in numbered address spaces could be properly analyzed. Differential Revision: https://reviews.llvm.org/D89140	2020-10-16 08:43:09 -04:00
Florian Hahn	89c0124273	[LoopVersion] Unify SCEVChecks and alias check handling (NFC). This is an initial cleanup of the way LoopVersioning interacts with LAA. Currently LoopVersioning has 2 ways of initializing things: 1. Passing LAI and passing UseLAIChecks = true 2. Passing UseLAIChecks = false, followed by calling setSCEVChecks and setAliasChecks. Both ways of initializing lead to the same result and the duplication seems more complicated than necessary. This patch removes the UseLAIChecks flag from the constructor and the setSCEVChecks & setAliasChecks helpers and move initialization exclusively to the constructor. This simplifies things, by providing a single way to initialize LoopVersioning and reducing duplication. Reviewed By: Meinersbur, lebedev.ri Differential Revision: https://reviews.llvm.org/D84406	2020-10-15 22:02:17 +01:00
Roman Lebedev	7ee6c40247	Revert "Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown"" and its follow-ups While we haven't encountered an earth-shattering problem with this yet, by now it is pretty evident that trying to model the ptr->int cast implicitly leads to having to update every single place that assumed no such cast could be needed. That is of course the wrong approach. Let's back this out, and re-attempt with another approach, possibly one originally suggested by Eli Friedman in https://bugs.llvm.org/show_bug.cgi?id=46786#c20 which should hopefully spare us this pain and more. This reverts commits `1fb6104293`, `7324616660`, `aaafe350bb`, `e92a8e0c74`. I've kept&improved the tests though.	2020-10-14 16:09:18 +03:00
Juneyoung Lee	9b3c2a72e4	[ValueTracking] Use assume's noundef operand bundle This patch updates `isGuaranteedNotToBeUndefOrPoison` to use `llvm.assume`'s `noundef` operand bundle. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89219	2020-10-14 20:16:33 +09:00
Roman Lebedev	1fb6104293	Reland "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" This relands commit `1c021c64ca` which was reverted in commit `17cec6a11a` because an assertion was being triggered, since `BuildConstantFromSCEV()` wasn't updated to handle the case where the constant we want to truncate is actually a pointer. I was unsuccessful in coming up with a test case where we'd end there with constant zext/sext of a pointer, so i didn't handle those cases there until there is a test case. Original commit message: While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 23:02:55 +03:00
Hans Wennborg	17cec6a11a	Revert `1c021c64c` "[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown" > While we indeed can't treat them as no-ops, i believe we can/should > do better than just modelling them as `unknown`. `inttoptr` story > is complicated, but for `ptrtoint`, it seems straight-forward > to model it just as a zext-or-trunc of unknown. > > This may be important now that we track towards > making inttoptr/ptrtoint casts not no-op, > and towards preventing folding them into loads/etc > (see D88979/D88789/D88788) > > Reviewed By: mkazantsev > > Differential Revision: https://reviews.llvm.org/D88806 It caused the following assert during Chromium builds: llvm/lib/IR/Constants.cpp:1868: static llvm::Constant *llvm::ConstantExpr::getTrunc(llvm::Constant *, llvm::Type *, bool): Assertion `C->getType()->isIntOrIntVectorTy() && "Trunc operand must be integer"' failed. See code review for a link to a reproducer. This reverts commit `1c021c64ca`.	2020-10-12 18:39:35 +02:00
Florian Hahn	ad5541045a	[LoopDeletion] Remove over-eager SCEV verification. `60b852092c` introduced SCEV verification to deleteDeadLoop, but it appears this check is currently a bit over-eager and some users of deleteDeadLoop appear to only patch up SE after calling it (e.g. PR47753). Remove the extra check for now. We can consider adding it back after we tracked down the source of the inconsistency for PR47753.	2020-10-12 16:18:30 +01:00
Roman Lebedev	1c021c64ca	[SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown While we indeed can't treat them as no-ops, i believe we can/should do better than just modelling them as `unknown`. `inttoptr` story is complicated, but for `ptrtoint`, it seems straight-forward to model it just as a zext-or-trunc of unknown. This may be important now that we track towards making inttoptr/ptrtoint casts not no-op, and towards preventing folding them into loads/etc (see D88979/D88789/D88788) Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D88806	2020-10-12 11:04:03 +03:00
Arthur Eubanks	0689dab844	[FixIrreducible][NewPM] Port -fix-irreducible to NPM In the NPM, a pass cannot depend on another non-analysis pass. So pin the test that tests that -lowerswitch is run automatically to legacy PM. Reviewed By: sameerds Differential Revision: https://reviews.llvm.org/D89051	2020-10-09 09:22:09 -07:00
Simon Pilgrim	8f0658ae67	[Transforms] CodeExtractor::verifyAssumptionCache - don't dereference a dyn_cast<>. NFCI. Use cast<> as we immediately dereference the pointer afterwards - cast<> will assert if we fail. Prevents clang static analyzer warning that we could dereference a null pointer.	2020-10-08 19:04:30 +01:00
Reid Kleckner	940d7aaea9	Port StripGCRelocates pass to NPM Fixes one test under NPM Differential Revision: https://reviews.llvm.org/D88766	2020-10-07 14:41:29 -07:00
Reid Kleckner	da48fe1732	[NPM] Port strip nonlinetable debuginfo pass to the new pass manager Fixes a few tests in llvm/test/Transforms/Utils. Differential Revision: https://reviews.llvm.org/D88762	2020-10-07 14:35:36 -07:00
Dávid Bolvanský	86429c4eaf	[SimplifyLibCalls] Optimize mempcpy_chk to mempcpy	2020-10-06 17:08:46 +02:00
Dávid Bolvanský	a4bae56ab8	Revert "[SLC] Optimize mempcpy_chk to mempcpy" This reverts commit `3f1fd59de3`.	2020-10-05 22:27:14 +02:00
Dávid Bolvanský	3f1fd59de3	[SLC] Optimize mempcpy_chk to mempcpy As reported in PR46735: void* f(void *d, const void *s, size_t l) { return __builtin___mempcpy_chk(d, s, l, __builtin_object_size(d, 0)); } This can be optimized to `return mempcpy(d, s, l);`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86019	2020-10-05 22:18:36 +02:00
Simon Pilgrim	aacfe2be53	[InstCombine] recognizeBSwapOrBitReverseIdiom - add vector support Add basic vector handling to recognizeBSwapOrBitReverseIdiom/collectBitParts - this works at the element level, all vector element operations must match (splat constants etc.) and there is no cross-element support (insert/extract/shuffle etc.).	2020-10-03 16:26:46 +01:00
Simon Pilgrim	347fd9955a	[InstCombine] recognizeBSwapOrBitReverseIdiom - use generic CreateIntegerCast Try to appease buildbots breakages due to D88578	2020-10-03 15:29:22 +01:00
Simon Pilgrim	3aa93f690b	[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) (Reapplied) If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern. Reapplied with early-out if recognizeBSwapOrBitReverseIdiom collects a source wider than the result type. Differential Revision: https://reviews.llvm.org/D88578	2020-10-03 14:52:42 +01:00
Arthur Eubanks	321986fe68	[MetaRenamer][NewPM] Port metarenamer to NPM Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D88690	2020-10-02 15:42:25 -07:00
Simon Pilgrim	0364721e3e	Revert rG3d14a1e982ad27 - "[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191)" This reverts commit `3d14a1e982`. This is breaking on some 2stage clang buildbots	2020-10-02 18:17:14 +01:00
Simon Pilgrim	3d14a1e982	[InstCombine] recognizeBSwapOrBitReverseIdiom - support for 'partial' bswap patterns (PR47191) If we're bswap'ing some bytes and zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern. Differential Revision: https://reviews.llvm.org/D88578	2020-10-02 17:25:12 +01:00
Philip Reames	f29645e7af	[gvn] Handle a corner case w/vectors of non-integral pointers If we try to coerce a vector of non-integral pointers to a narrower type (either narrower vector or single pointer), we use inttoptr and violate the semantics of non-integral pointers. In theory, we can handle many of these cases, we just need to use a different code idiom to convert without going through inttoptr and back. This shows up as wrong code bugs, and in some cases, crashes due to failed asserts. Modeled after a change which has lived downstream for a couple years, though completely rewritten to be more idiomatic.	2020-10-01 19:20:21 -07:00
Simon Pilgrim	29ac9fae54	[InstCombine] collectBitParts - convert to use PatternMatch matchers and avoid IntegerType casts. Make sure we're using getScalarSizeInBits instead of cast<IntegerType> to get Type bit widths. This is preliminary cleanup before we can start adding vector support to the bswap/bitreverse (element level) matching.	2020-10-01 16:44:14 +01:00
Simon Pilgrim	bc730b5e43	[InstCombine] collectBitParts - use APInt directly to check for out of range bit shifts. NFCI.	2020-10-01 12:50:36 +01:00
Simon Pilgrim	c722b32596	[InstCombine] recognizeBSwapOrBitReverseIdiom - merge the regular/trunc+zext paths. NFCI. There doesn't seem to be any good reason for having a separate path for when we bswap/bitreverse at a smaller size than the destination size - so merge these to make the instruction generation a lot clearer.	2020-09-30 14:54:04 +01:00
Simon Pilgrim	d5545a8993	[InstCombine] recognizeBSwapOrBitReverseIdiom - remove unnecessary cast. NFCI.	2020-09-30 14:44:15 +01:00
Simon Pilgrim	621c6c8962	[InstCombine] recognizeBSwapOrBitReverseIdiom - cleanup bswap/bitreverse detection loop. NFCI. Early out if both pattern matches have failed (or we don't want them). Fix case of bit index iterator (and avoid Wshadow issue).	2020-09-30 14:19:18 +01:00
Simon Pilgrim	413b4998bd	[InstCombine] recognizeBSwapOrBitReverseIdiom - use ArrayRef::back() helper. NFCI. Post-commit feedback on D88316	2020-09-30 13:39:18 +01:00
Simon Pilgrim	05290eead3	[InstCombine] collectBitParts - cleanup variable names. NFCI. Fix a number of WShadow warnings (I was used as the instruction and index......) and fix cases to match style. Also, replaced the Bit APInt mask check in AND instructions with a direct APInt[] bit check.	2020-09-30 13:25:32 +01:00
Simon Pilgrim	af47d40b9c	[InstCombine] recognizeBSwapOrBitReverseIdiom - recognise zext(bswap(trunc(x))) patterns (PR39793) PR39793 demonstrated an issue where we fail to recognize 'partial' bswap patterns of the lower bytes of an integer source. In fact, most of this is already in place collectBitParts suitably tags zero bits, so we just need to correctly handle this case by finding the zero'd upper bits and reducing the bswap pattern just to the active demanded bits. Differential Revision: https://reviews.llvm.org/D88316	2020-09-30 12:07:19 +01:00
Simon Pilgrim	ec3f24d453	[InstCombine] recognizeBSwapOrBitReverseIdiom - assert for correct bit provenance indices. NFCI. As suggested by @spatel on D88316	2020-09-30 11:16:33 +01:00
Jeremy Morse	05659606a2	Revert "[gardening] Replace some uses of setDebugLoc(DebugLoc()) with dropLocation(), NFC" Some of the buildbots have croaked with this patch, for example failures that begin in this build: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/29933 This reverts commit `674f57870f`.	2020-09-30 09:52:12 +01:00
Vedant Kumar	674f57870f	[gardening] Replace some uses of setDebugLoc(DebugLoc()) with dropLocation(), NFC	2020-09-29 17:39:07 -07:00
Vedant Kumar	26ee8aff2b	[CodeExtractor] Don't create bitcasts when inserting lifetime markers (NFCI) Lifetime marker intrinsics support any pointer type, so CodeExtractor does not need to bitcast to `i8*` in order to use these markers.	2020-09-29 16:34:36 -07:00
Juneyoung Lee	67aac915ba	[BuildLibCalls] Add noundef to the returned pointers of allocators and argument of free This patch adds noundef to the returned pointers of allocators (malloc, calloc, ...) and the pointer argument of free. The returned pointer of allocators cannot be poison or (partially) undef. Since the pointer that is given to free should precisely have zero offset, it cannot be poison or (partially) undef too. For the size arguments of allocators, noundef wasn't attached simply because I wasn't sure whether attaching it is okay or not. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D87984	2020-09-30 02:13:48 +09:00
Florian Hahn	7bae2bc5a8	[LoopUtils] Only verify SE in builds with assertions. Follow up to `60b852092c`.	2020-09-29 13:39:23 +01:00
David Stenberg	e6f332ef1e	[IndVarSimplify] Fix Modified status for removal of overflow intrinsics When removing an overflow intrinsic the Changed status in SimplifyIndvar was not set, leading to the IndVarSimplify pass returning an incorrect status. This was caught using the check introduced by D80916. As pointed out in the code review, a similar bug may exist for eliminateTrunc(). Reviewed By: reames Differential Revision: https://reviews.llvm.org/D85971	2020-09-29 13:20:59 +02:00
Florian Hahn	60b852092c	[LoopDeletion] Forget loop before setting values to undef After D71539, we need to forget the loop before setting the incoming values of phi nodes in exit blocks, because we are looking through those phi nodes now and the SCEV expression could depend on the loop phi. If we update the phi nodes before forgetting the loop, we miss those users during invalidation. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D88167	2020-09-29 10:38:44 +01:00
Dávid Bolvanský	155ac33394	[BuildLibCalls] Add noalias for strcat and stpcpy strcat: destination and source shall not overlap. (http://www.cplusplus.com/reference/cstring/strcat/) stpcpy: The strings may not overlap, and the destination string dest must be large enough to receive the copy. (https://man7.org/linux/man-pages/man3/stpcpy.3.html) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D88335	2020-09-27 21:37:09 +02:00
Nikita Popov	9b959b59df	[LVI] Require context instruction in external API (NFCI) Require CxtI in getConstant() and getConstantRange() APIs. Accordingly drop the BB parameter, as it is implied by CxtI->getParent(). This makes sure we don't forget to pass the context instruction, and makes the API contract clearer (also clean up the comments to that effect -- the value holds at the context instruction, not the end of the block).	2020-09-27 18:07:24 +02:00
Simon Pilgrim	2a0ca17f66	[InstCombine] collectBitParts - add fshl/fshr handling Pulled from D87452, this is a fixed version of the collectBitParts fshl/fshr handling which, as @nikic noticed, wasn't checking for different providers and didn't have correct bit ordering (which was hidden by only testing shift amounts of bitwidth/2). Differential Revision: https://reviews.llvm.org/D88292	2020-09-25 20:34:59 +01:00
Arthur Eubanks	6b1ce83a12	[NewPM][CGSCC] Handle newly added functions in updateCGAndAnalysisManagerForPass This seems to fit the CGSCC updates model better than calling addNewFunctionInto{Ref,}SCC() on newly created/outlined functions. Now addNewFunctionInto{Ref,}SCC() are no longer necessary. However, this doesn't work on newly outlined functions that aren't referenced by the original function. e.g. if a() was outlined into b() and c(), but c() is only referenced by b() and not by a(), this will trigger an assert. This also fixes an issue I was seeing with newly created functions not having passes run on them. Ran check-llvm with expensive checks. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D87798	2020-09-23 15:22:18 -07:00
Hubert Tong	32c9991dab	[InstCombine] Fix errno bug in pow expansion to sqrt A conversion from `pow` to `sqrt` shall not call an `errno`-setting `sqrt` with -infinity: the `sqrt` will set `EDOM` where the `pow` call need not. This patch avoids the erroneous (pun not intended) transformation by applying the restrictions discussed in the thread for https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html. The existing tests are updated (depending on emphasis in the checks for library calls, avoidance of overlap, and overall coverage): - to add `ninf`, retaining the intended library call, - to use the intrinsic, retaining the use of `select`, or - to expect the replacement to not occur. The following is tested: - The pow intrinsic folds to a `select` instruction to handle -infinity. - The pow library call folds, with `ninf`, to `sqrt` without the `select` instruction associated with handling -infinity. - The pow library call does not fold to `sqrt` without `ninf`. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D87877	2020-09-22 18:58:05 -04:00
Stefanos Baziotis	89c1e35f3c	[LoopInfo] empty() -> isInnermost(), add isOutermost() Differential Revision: https://reviews.llvm.org/D82895	2020-09-22 23:28:51 +03:00
Hubert Tong	6801950192	[InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case The current code for handling pow(x, y) where y is an integer plus 0.5 is not explicitly guarded against attempting to transform the case where abs(y) is exactly 0.5. The latter case is meant to be handled by `replacePowWithSqrt`. Indeed, if the pow(x, integer+0.5) case proceeds past a certain point, it will hit an assertion by attempting to form pow(x, 0) using `getPow`. This patch adds an explicit check to prevent attempting the pow(x, integer+0.5) transformation on pow(x, +/-0.5) as suggested during the review of D87877. This has the effect of retaining the shrinking of `pow` to `powf` when the `sqrt` libcall cannot be formed. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88066	2020-09-22 14:23:32 -04:00
Fangrui Song	6913812abc	Fix some clang-tidy bugprone-argument-comment issues	2020-09-19 20:41:25 -07:00
Nikita Popov	f4e5541809	[Local] Clean up enforceKnownAlignment() (NFC) I want to export this function, and the current API was a bit weird: It took an additional Alignment argument that didn't really have anything to do with what the function does. Drop it, and perform a max at the callsite. Also rename it to tryEnforceAlignment().	2020-09-19 22:29:40 +02:00
Florian Hahn	1d8f2e5292	[SCEVExpander] Support expanding nonintegral pointers with constant base. Currently SCEVExpander creates inttoptr for non-integral pointers if the base is a null constant for example. This results in invalid IR. This patch changes InsertNoopCastOfTo to emit a GEP & bitcast to convert to a non-integral pointer. First, a GEP of i8* null is generated and the integral value is used as index. The GEP is then bitcasted to the target type. This was exposed by D71539. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87827	2020-09-19 17:19:53 +01:00
Fangrui Song	76eec6c95b	[SCEV] Fix an unused variable in -DLLVM_ENABLE_ASSERTIONS=off build	2020-09-18 16:19:05 -07:00
Roman Lebedev	aadf55d1ce	[NFC] EliminateDuplicatePHINodes(): small-size optimization: if there are <= 32 PHI's, O(n^2) algo is faster (geomean -0.08%) This is functionally equivalent to the old implementation. As per https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=4739e6e4eb54d3736e6457249c0919b30f6c855a&stat=instructions this is a clear geomean compile-time regression-free win with overall geomean of `-0.08%` 32 PHI's appears to be the sweet spot; both the 16 and 64 performed worse: https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=c4efe1fbbfdf0305ac26cd19eacb0c7774cdf60e&stat=instructions https://llvm-compile-time-tracker.com/compare.php?from=5f4e9bf6416e45eba483a4e5e263749989fdb3b3&to=e4989d1c67010d3339d1a40ff5286a31f10cfe82&stat=instructions If we have more PHI's than that, we fall-back to the original DenseSet-based implementation, so the not-so-fast cases will still be handled. However compile-time isn't the main motivation here. I can name at least 3 limitations of this CSE: 1. Assumes that all PHI nodes have incoming basic blocks in the same order (can be fixed while keeping the DenseMap) 2. Does not special-handle `undef` incoming values (i don't see how we can do this with hashing) 3. Does not special-handle backedge incoming values (maybe can be fixed by hashing backedge as some magical value) Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87408	2020-09-17 11:29:03 +03:00
Arthur Eubanks	f7aa1563eb	[LowerSwitch][NewPM] Port lowerswitch to NPM Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D87726	2020-09-15 18:18:31 -07:00
Wenlei He	2ea4c2c598	[BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults ~~D65060 uncovered that trying to use BFI in loop passes can lead to non-deterministic behavior when blocks are re-used while retaining old BFI data.~~ ~~To make sure BFI is preserved through loop passes a Value Handle (VH) callback is registered on blocks themselves. When a block is freed it now also wipes out the accompanying BFI entry such that stale BFI data can no longer persist resolving the determinism issue. ~~ ~~An optimistic approach would be to incrementally update BFI information throughout the loop passes rather than only invalidating them on removed blocks. The issues with that are:~~ ~~1. It is not clear how BFI information should be incrementally updated: If a block is duplicated does its BFI information come with? How about if it's split/modified/moved around? ~~ ~~2. Assuming we can address these problems the implementation here will be a massive undertaking. ~~ ~~There's a known need of BFI in LICM analysis which requires correct but not incrementally updated BFI data. A follow-up change can register BFI in all loop passes so this preserved but potentially lossy data is available to any loop pass that wants it.~~ See: D75341 for an identical implementation of preserving BFI via VH callbacks. The previous statements do still apply but this change no longer has to be in this diff because it's already upstream 😄 . This diff also moves BFI to be a part of LoopStandardAnalysisResults since the previous method using getCachedResults now (correctly!) statically asserts (D72893) that this data isn't static through the loop passes. Testing Ninja check Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D86156	2020-09-15 16:16:24 -07:00
Xun Li	7b4cc0961b	[TSAN] Handle musttail call properly in EscapeEnumerator (and TSAN) Call instructions with the musttail tag must be optimized as a tail call, otherwise this could lead to incorrect program behavior. When TSAN is instrumenting functions, it broke the contract by adding a call to the tsan exit function in between the musttail call and the return instruction, and also inserted exception handling code. This happened through EscapeEnumerator, which adds exception handling code and returns ret instructions as the place to insert instrumentation calls. This becomes especially problematic for coroutines, because coroutines rely on tail calls to do symmetric transfers properly. To fix this, this patch moves the location to insert instrumentation calls prior to the musttail call for ret instructions that are following musttail calls, and also does not handle exceptions for musttail calls. Differential Revision: https://reviews.llvm.org/D87620	2020-09-15 15:20:05 -07:00
Simon Pilgrim	65c6ae3b6a	[Utils] isLegalToPromote - Fix missing null check before writing to FailureReason. The FailureReason input parameter may be null; we check this in all other cases in the method, but this one was missed somehow. Fixes clang-tidy warning.	2020-09-15 14:49:04 +01:00
Sanjay Patel	aa57c1c967	[InstCombine] fix bug in pow expansion There is at least one other bug related to pow -> sqrt transforms: http://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html ...but we probably can't solve that without fixing this first.	2020-09-15 09:29:48 -04:00
Simon Pilgrim	4ff4708d39	collectBitParts - use const references. NFCI. Fixes clang-tidy warnings first noticed on D87452.	2020-09-14 18:23:00 +01:00
Jay Foad	9a4476072e	[UnifyLoopExits] Fix non-deterministic iteration order This was causing random minor codegen differences in shaders compiled with the AMDGPU backend. Differential Revision: https://reviews.llvm.org/D87548	2020-09-14 09:09:58 +01:00
David Sherwood	1e1770a07e	[SVE][CodeGen] Fix InlineFunction for scalable vectors When inlining functions containing allocas of scalable vectors we cannot specify the size in the lifetime markers, since we don't know this at compile time. Added new test here: test/Transforms/Inline/AArch64/sve-alloca-merge.ll Differential Revision: https://reviews.llvm.org/D87139	2020-09-11 08:34:51 +01:00
Sam Parker	0bdf8c9127	[SCEV] Constant expansion cost at minsize As code size is the only thing we care about at minsize, query the cost of materialising immediates when calculating the cost of a SCEV expansion. We also modify the CostKind to TCK_CodeSize for minsize, instead of RecipThroughput. Differential Revision: https://reviews.llvm.org/D76434	2020-09-10 08:21:11 +01:00
David Stenberg	48fc781438	[UnifyFunctionExitNodes] Fix Modified status for unreachable blocks If a function had at most one return block, the pass would return false regardless of whether a unified unreachable block was created. This patch fixes that by refactoring runOnFunction into two separate helper functions for handling the unreachable blocks and the return blocks, respectively, as suggested by @bjope in a review comment. This was caught using the check introduced by D80916. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D85818	2020-09-09 13:36:03 +02:00
Juneyoung Lee	36c8621638	[BuildLibCalls] Add more noundef to library functions This patch follows D85345 and adds more noundef attributes to return values/arguments of library functions that are mostly about accessing the file system or processes. A few functions like `chmod` or `times` use typedef `mode_t` and `clock_t`. They are neither struct nor union, so they cannot contain undef even if they're lowered to iN in IR. So, it is fine to add noundef to them. - clock_t's actual type is size_t (C17, 7.27.1.3), so it isn't struct or union. - For mode_t, either int or long is used in practice because programmers use bit manipulation. So, I think it is okay that it's never aggregate in practice. After this patch, the remaining library functions are those that eagerly participate in optimizations: they can be removed, reordered, or introduced by a transformation from primitive IR operations. For them, some testing is needed, since it may not be valid to add noundef anymore even if the C standard says it's okay. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85894	2020-09-09 20:33:35 +09:00
David Stenberg	17dce2fe43	[UnifyFunctionExitNodes] Remove unused getters, NFC The get{Return,Unwind,Unreachable}Block functions in UnifyFunctionExitNodes have not been used for many years, so just remove them. Reviewed By: bjope Differential Revision: https://reviews.llvm.org/D87078	2020-09-08 20:42:28 +02:00
Sam Parker	928c4b4b49	[SCEV] Refactor isHighCostExpansionHelper To enable the cost of constants, the helper function has been reorganised: - A struct has been introduced to hold SCEV operand information so that we know the user of the operand, as well as the operand index. The Worklist now uses this struct instead of a bare SCEV. - The costing of each SCEV, and collection of its operands, is now performed in a helper function. Differential Revision: https://reviews.llvm.org/D86050	2020-09-07 11:57:46 +01:00
Sam Parker	65f78e73ad	[SimplifyCFG] Consider cost of combining predicates. Modify FoldBranchToCommonDest to consider the cost of inserting instructions when attempting to combine predicates to fold blocks. The threshold can be controlled via a new option: -simplifycfg-branch-fold-threshold which defaults to '2' to allow the insertion of a not and another logical operator. Differential Revision: https://reviews.llvm.org/D86526	2020-09-07 10:04:50 +01:00
serge-sans-paille	3a6f3fc160	Fix return status of SimplifyCFG When a switch case is folded into default's case, that's an IR change that should be reported, update ConstantFoldTerminator accordingly. Differential Revision: https://reviews.llvm.org/D87142	2020-09-05 07:54:15 +02:00
Roman Lebedev	1dcb936cf6	[NFC][Local] EliminateDuplicatePHINodes(): add STATISTIC()	2020-08-29 22:03:18 +03:00
Roman Lebedev	961483a5ea	[NFCI][Local] Rewrite EliminateDuplicatePHINodes to optionally check hashing invariants EarlyCSE has a mode to verify the invariant that hash equality equals key equality, but EliminateDuplicatePHINodes() doesn't. I've verified that this would have caught the stage2-stage3 mismatches `5ec2b757cc` revert has fixed, that were introduced last time in `3e69871ab5`.	2020-08-29 22:03:10 +03:00
Roman Lebedev	5ec2b757cc	[Instruction] Speculatively undo isIdenticalToWhenDefined() PHI handling changes The stage2-stage3 differences persist even without instcombine-based PHI CSE, so this is the only possible reason.	2020-08-29 19:38:57 +03:00
Benjamin Kramer	8782c72765	Strength-reduce SmallVectors to arrays. NFCI.	2020-08-28 21:14:20 +02:00
David Sherwood	f4257c5832	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
Florian Hahn	20e989e9de	[BuildLibCalls] Add argmemonly to more lib calls. strspn, strncmp, strcspn, strcasecmp, strncasecmp, memcmp, memchr, memrchr, memcpy, memmove, memcpy, mempcpy, strchr, strrchr, bcmp should all only access memory through their arguments. I broke out strcoll, strcasecmp, strncasecmp because the result depends on the locale, which might get accessed through memory. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86724	2020-08-28 09:50:38 +01:00
Florian Hahn	419c6948df	[SimplifyLibCalls] Remove over-eager early return in strlen optzns. Currently we bail out early for strlen calls with a GEP operand, if none of the GEP specific optimizations fire. But there could be later optimizations that still apply, which we currently miss out on. An example is that we do not apply the following optimization strlen(x) == 0 --> *x == 0 Unless I am missing something, there seems to be no reason for bailing out early there. Fixes PR47149. Reviewed By: lebedev.ri, xbolva00 Differential Revision: https://reviews.llvm.org/D85886	2020-08-27 15:19:45 +01:00
Sam Parker	8ce450da32	[NFCI][SimplifyCFG] Combine select costs and checks Combine the cost modelling and validity checks for the phi to select conversion in SpeculativelyExecuteBB, extracting the logic out into a function.	2020-08-24 09:16:11 +01:00
Amy Huang	5e3fd471ac	[Cloning] Fix to cloning DISubprograms. When trying to enable -debug-info-kind=constructor there was an assert that occurs during debug info cloning ("mismatched subprogram between llvm.dbg.value variable and !dbg attachment"). It appears that during llvm::CloneFunctionInto, a DISubprogram could be duplicated when MapMetadata is called, and then added to the MD map again when DIFinder gets a list of subprograms. This results in two different versions of the DISubprogram. This patch switches the order so that the DIFinder subprograms are added before MapMetadata is called. Fixes https://bugs.llvm.org/show_bug.cgi?id=46784 Differential Revision: https://reviews.llvm.org/D86185	2020-08-21 11:54:56 -07:00
Florian Hahn	8eded24bf4	Recommit "[SCEVExpander] Add helper to clean up instrs inserted while expanding." Recommit the patch after fixing a reported issue caused by the fact that re-used values are also added to InsertedValues. Additional tests have been added in `88818491b9`. This reverts the revert commit `38884641f2`.	2020-08-21 15:04:17 +01:00
Sam Parker	bfc6d8b59b	[NFC][SimplifyCFG] Formatting and variable rename	2020-08-21 13:11:17 +01:00
Sam Parker	47251582f5	[SimplifyCFG] Cost required selects Before we speculatively execute a basic block, query the cost of inserting the necessary select instructions against the phi folding threshold. For non-trivial insertions, a more accurate decision can probably be made during machine if-conversion. With minsize we query the CodeSize cost, otherwise we use SizeAndLatency. Differential Revision: https://reviews.llvm.org/D82438	2020-08-21 09:52:52 +01:00
Dávid Bolvanský	f134fc4f1b	Reland "[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)"	2020-08-15 12:14:57 +02:00
Martin Storsjö	3e7403a134	Revert "[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)" This reverts commit `6dbf0cfcf7`. That commit caused failed assertions, e.g. like this: $ cat sprintf-strcpy.c char *ptr; void func(void) { ptr += sprintf(ptr, "%s", ""); } $ clang -c sprintf-strcpy.c -O2 -target x86_64-linux-gnu clang: ../lib/IR/Value.cpp:473: void llvm::Value::doRAUW(llvm::Value *, llvm::Value::ReplaceMetadataUses): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.	2020-08-15 09:35:11 +03:00
Dávid Bolvanský	f62de7c9c7	[SLC] Transform strncpy(dst, "text", C) to memcpy(dst, "text\0\0\0", C) for C <= 128 only Transformation creates big strings for big C values, so bail out for C > 128. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86004	2020-08-15 01:53:32 +02:00
Jordan Rupprecht	38884641f2	Temporarily revert "[SCEVExpander] Add helper to clean up instrs inserted while expanding." This reverts commit `7829c33084`. The assertion is triggering on some internal code. A reduced test case is in progress.	2020-08-14 14:52:37 -07:00
Dávid Bolvanský	6dbf0cfcf7	[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str) Transform sprintf(dst, "%s", str) -> strcpy(dst, str) if result is unused Avoid sprintf(dest, "%s", str) -> llvm.memcpy(align 1 dest, align 1 str, strlen(str)+1) if optimizing for size. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85963	2020-08-14 23:48:53 +02:00
Arthur Eubanks	48cd5b72b1	Revert "[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str)" This reverts commit `ab9fc8bae8`. Incorrect transformation if the result is used. Causes breakages, e.g. http://green.lab.llvm.org/green/job/test-suite-verify-machineinstrs-x86_64-O3/8193/	2020-08-13 21:05:03 -07:00
Dávid Bolvanský	ab9fc8bae8	[SLC] sprintf(dst, "%s", str) -> strcpy(dst, str) Solves 46489	2020-08-14 00:05:55 +02:00
Dávid Bolvanský	5ef2287d36	[SLC] Optimize strncpy(a, "a", C) to memcpy(a, "a\0\0\0", C) Solves PR47154	2020-08-13 22:22:51 +02:00
David Stenberg	e8ebebb0bd	[InstCombine] Fix incorrect Modified status When removing instructions from unreachable blocks, and only debug info intrinsics were removed, InstCombine could incorrectly return a false Modified status. This is fixed by making removeAllNonTerminatorAndEHPadInstructions() also return how many debug info intrinsics that were removed, and take that into account. This was caught using the check introduced by D80916. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D85839	2020-08-13 15:10:41 +02:00
Whitney Tsang	aa994d9867	[NFC][LoopUnrollAndJam] Use BasicBlock::replacePhiUsesWith instead of static function updatePHIBlocks. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D85673	2020-08-11 15:35:14 +00:00
Florian Hahn	7829c33084	[SCEVExpander] Add helper to clean up instrs inserted while expanding. SCEVExpander already tracks which instructions have been inserted in InsertedValues/InsertedPostIncValues. This patch adds an additional vector to collect the instructions in insertion order. This can then be used to remove exactly the instructions inserted by the expander. This replaces ExpandedValuesCleaner, which in some cases might remove values not inserted by the expander (e.g. if a value was dead before insertion and is then used during expansion). Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D84327	2020-08-11 09:30:31 +01:00
Juneyoung Lee	ef018cb65c	[BuildLibCalls] Add noundef to standard I/O functions This patch adds noundef to return value and arguments of standard I/O functions. With this patch, passing undef or poison to the functions becomes undefined behavior in LLVM IR. Since undef/poison is lowered from operations having UB in C/C++, passing undef to them was already UB in source. With this patch, the functions cannot return undef or poison anymore as well. According to C17 standard, ungetc/ungetwc/fgetpos/ftell can generate unspecified value; 3.19.3 says unspecified value is a valid value of the relevant type, and using unspecified value is unspecified behavior, which is not UB, so it cannot be undef (using undef is UB when e.g. it is used at branch condition). — The value of the file position indicator after a successful call to the ungetc function for a text stream, or the ungetwc function for any stream, until all pushed-back characters are read or discarded (7.21.7.10, 7.29.3.10). — The details of the value stored by the fgetpos function (7.21.9.1). — The details of the value returned by the ftell function for a text stream (7.21.9.4). In the long run, most of the functions listed in BuildLibCalls should have noundefs; to remove redundant diffs which will anyway disappear in the future, I added noundef to a few more non-I/O functions as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D85345	2020-08-10 10:58:25 +09:00
Florian Hahn	23817cbd0b	[SCEVExpander] Make sure cast properly dominates Builder's IP. The selected cast must properly dominate the Builder's IP, so we cannot re-use the cast, if it matches the builder's IP.	2020-08-09 16:51:19 +01:00
Florian Hahn	c70f0b9d4a	[SCEVExpander] Avoid re-using existing casts if it means updating users. Currently the SCEVExpander tries to re-use existing casts, even if they are not exactly at the insertion point where it was asked to create the cast. To do so in some cases, it creates a new cast at the insertion point and updates all users to use the new cast. This behavior is problematic, because it changes the IR outside of the instructions created during the expansion. Therefore we cannot completely undo all changes made during expansion. This re-use should be only an extra optimization, so only using the new cast in the expanded instructions should not be a correctness issue. There are many cases where equivalent instructions are created during expansion. This patch also adjusts findInsertPointAfter to skip instructions inserted during expansion. This enables re-using existing casts without renaming any uses, by picking a better insertion point. Reviewed By: efriedma, lebedev.ri Differential Revision: https://reviews.llvm.org/D84399	2020-08-09 13:25:17 +01:00
Roman Lebedev	e492f0e03b	[SimplifyCFG] Fix invoke->call fold w/ multiple invokes in presence of lifetime intrinsics SimplifyCFG has two main folds for resumes - one when resume is directly using the landingpad, and the other one where resume is using a PHI node. While for the first case, we were already correctly ignoring all the PHI nodes, and both the debug info intrinsics and lifetime intrinsics, in the PHI-based-one, we weren't ignoring PHI's in the resume block, and weren't ignoring lifetime intrinsics. That is clearly a bug. On RawSpeed library, this results in +9.34% (+81) more invoke->call folds, -0.19% (-39) landing pads, -0.24% (-81) invoke instructions but +51 call instructions and -132 basic blocks. Though, the run-time performance impact appears to be within the noise.	2020-08-08 20:00:28 +03:00
Roman Lebedev	1f452ac1d7	[NFC][SimplifyCFG] Rewrite isCleanupBlockEmpty() to be iterator_range-based	2020-08-08 20:00:28 +03:00

5640 Commits