llvm-project

Commit Graph

Author	SHA1	Message	Date
Arnold Schwaighofer	2df2b27d94	[coro async] Cleanup undefined llvm.coro.async.resume In situations where the coroutine function is not split we can just replace the async.resume by null. rdar://82591919 Differential Revision: https://reviews.llvm.org/D110191	2021-09-30 13:26:53 -07:00
Florian Hahn	1fbdbb5595	Revert "Recommit "[SCEV] Look through single value PHIs." (take 2)" This reverts commit `764d9aa979`. This patch exposed a few additional cases where SCEV expressions are not properly invalidated. See PR52024, PR52023.	2021-09-30 20:53:51 +01:00
Sanjay Patel	66c069d7d6	[InstCombine] add tests for shift-trunc-shift; NFC	2021-09-30 15:06:13 -04:00
Craig Topper	765348298c	[CostModel] Update default cost model for sadd/ssub overflow to match TargetLowering The expansion for these was updated in https://reviews.llvm.org/D47927 but the cost model was not adjusted. I believe the cost model was also incorrect for the old expansion. The expansion prior to D47927 used 3 icmps using LHS, RHS, and Result to calculate their signs. Then 2 icmps to compare the signs. Followed by an And. The previous cost model was using 3 icmps and 2 selects. Digging back through git blame, those 2 selects in the cost model used to be 2 icmps, but were changed in https://reviews.llvm.org/D90681 Differential Revision: https://reviews.llvm.org/D110739	2021-09-30 09:41:14 -07:00
Adrian Prantl	9232ca4712	Improve the effectiveness of BDCE's debug info salvaging This patch improves the effectiveness of BDCE's debug info salvaging by processing the instructions in reverse order and delaying dropAllReferences until after debug info salvaging. This allows salvaging of entire chains of deleted instructions! Previously we would remove all references from an instruction, which would make it impossible to use that instruction to salvage a later instruction in the instruction stream, because its operands were already removed. This reapplies the previous patch with a fix for a use-after-free. Differential Revision: https://reviews.llvm.org/D110568	2021-09-30 09:28:49 -07:00
Anna Thomas	452714f8f8	[BPI] Keep BPI available in loop passes through LoopStandardAnalysisResults This is analogous to D86156 (which preserves "lossy" BFI in loop passes). Lossy means that the analysis preserved may not be up to date with regards to new blocks that are added in loop passes, but BPI will not contain stale pointers to basic blocks that are deleted by the loop passes. This is achieved through BasicBlockCallbackVH in BPI, which calls eraseBlock that updates the data structures in BPI whenever a basic block is deleted. This patch does not have any changes in the upstream pipeline, since none of the loop passes in the pipeline use BPI currently. However, since BPI wasn't previously preserved in loop passes, the loop predication pass was invoking BPI on the entire function every time it ran in an LPM. This caused massive compile time in our downstream LPM invocation which contained loop predication. See updated test with an invocation of a loop-pipeline containing loop predication and -debug-pass turned ON. Reviewed-By: asbirlea, modimo Differential Revision: https://reviews.llvm.org/D110438	2021-09-30 10:27:05 -04:00
Bjorn Pettersson	3f8027fb67	[test] Update some test cases to use -passes when specifying the pipeline This updates transform test cases for ADCE AddDiscriminators AggressiveInstCombine AlignmentFromAssumptions ArgumentPromotion BDCE CalledValuePropagation DCE Reg2Mem WholeProgramDevirt to use the -passes syntax when specifying the pipeline. Given that LLVM_ENABLE_NEW_PASS_MANAGER isn't set to off (which is a deprecated feature) the updated test cases already used the new pass manager, but they were using the legacy syntax when specifying the passes to run. This patch can be seen as a step toward deprecating that interface. This patch also removes some redundant RUN lines. Here I am referring to test cases that had multiple RUN lines verifying both the legacy "-passname" syntax and the new "-passes=passname" syntax. Since we switched the default pass manager to "new PM" both RUN lines have verified the new PM version of the pass (more or less wasting time running the same test twice), unless LLVM_ENABLE_NEW_PASS_MANAGER is set to "off". It is assumed that it is enough to run these tests with the new pass manager now. Differential Revision: https://reviews.llvm.org/D108472	2021-09-29 21:51:08 +02:00
Sjoerd Meijer	367df18050	[LoopFlatten] Bail if we can't perform flattening after IV widening It can happen that after widening of the IV, flattening may not be possible, e.g. when it is deemed unprofitable. We were not properly checking this, which resulted in flattening being applied when it shouldn't, also leading to incorrect results (miscompilation). This should fix PR51980 (https://bugs.llvm.org/show_bug.cgi?id=51980) Differential Revision: https://reviews.llvm.org/D110712	2021-09-29 19:53:34 +01:00
Sanjay Patel	4414e2ad97	[InstSimplify] (-1 << x) s>> x --> -1 This was noticed in: https://llvm.org/PR51351 https://alive2.llvm.org/ce/z/aLxunD	2021-09-29 13:03:12 -04:00
Sanjay Patel	ea56dcb730	[InstCombine] fix miscompile from dropRedundantMaskingOfLeftShiftInput() The test is from https://llvm.org/PR51351. There are 2 related logic bugs from over-generalizing "lshr" to "any shr", but I'm not sure how to expose the difference for "MaskC" because instsimplify already folds ashr of -1. I'll extend instsimplify to catch the MaskD pattern as a follow-up, but this patch should be enough to avoid the miscompile.	2021-09-29 11:43:18 -04:00
Sanjay Patel	d3e2067c7c	[InstSimplify] add tests for (-1 << x) s>> x; NFC	2021-09-29 11:43:18 -04:00
Sanjay Patel	ac4f30ac49	[InstCombine] add test for miscompile in dropRedundantMaskingOfLeftShiftInput(); NFC (PR51351)	2021-09-29 11:43:18 -04:00
Simon Pilgrim	17f1fc1e54	[TTI] BasicTTI::getInterleavedMemoryOpCost(): use getScalarizationOverhead() getScalarizationOverhead() results in a somewhat better cost estimation than counting the insertion/extraction costs directly. Notably, this is still overestimating the costs. Original Patch by: @lebedev.ri (Roman Lebedev) Differential Revision: https://reviews.llvm.org/D110713	2021-09-29 16:41:53 +01:00
Florian Hahn	0b4a4cc72d	[IndVarSimplify] Forget phi value after changing incoming value. This fixes an issue exposed by D71539, where IndVarSimplify tries to access an invalid cached SCEV expression after making changes to the underlying PHI instruction earlier. When changing the incoming value of a PHI, forget the cached SCEV for the PHI.	2021-09-29 14:44:13 +01:00
Sanjay Patel	98fde3489a	[InstCombine] reduce redundant code for shl-binop folds This is NFCI (no-functional-change-intended), but there are benign diffs possible with commutable ops as seen in the test diffs. The transforms were repeated for the commutative opcodes, but that should not be necessary if we canonicalize the patterns that we're matching. If both operands of the binop match, that should get folded eventually. The transform that starts with a mask op seems to over-constrain the use checks, so that could be a potential enhancement.	2021-09-28 17:06:45 -04:00
Sanjay Patel	6c1a58fe51	[InstCombine] add multi-use tests for shl folds; NFC	2021-09-28 17:06:45 -04:00
Nikita Popov	abbbc480a1	Revert "Improve the effectiveness of BDCE's debug info salvaging" This reverts commit `f6954bf804`. This breaks the test-suite O3 build: /home/nikic/llvm-test-suite/build-O3/tools/timeit --summary Bitcode/Benchmarks/Halide/local_laplacian/CMakeFiles/halide_local_laplacian.dir/local_laplacian.bc.o.time /home/nikic/llvm-project/build/bin/clang++ -DNDEBUG -O3 -w -Werror=date-time -save-stats=obj -save-stats=obj -std=c++11 -MD -MT Bitcode/Benchmarks/Halide/local_laplacian/CMakeFiles/halide_local_laplacian.dir/local_laplacian.bc.o -MF Bitcode/Benchmarks/Halide/local_laplacian/CMakeFiles/halide_local_laplacian.dir/local_laplacian.bc.o.d -o Bitcode/Benchmarks/Halide/local_laplacian/CMakeFiles/halide_local_laplacian.dir/local_laplacian.bc.o -c ../Bitcode/Benchmarks/Halide/local_laplacian/local_laplacian.bc While deleting: i64 % Use still stuck around after Def is destroyed: %12620 = mul i64 %12619, <badref> clang++: /home/nikic/llvm-project/llvm/lib/IR/Value.cpp:103: llvm::Value::~Value(): Assertion `materialized_use_empty() && "Uses remain when a value is destroyed!"' failed.	2021-09-28 21:52:27 +02:00
Anna Thomas	03ce0841da	Add profile count. Regenerate check lines. NFC Function profile counts added to test cases. Regenerated test lines for loop predication test.	2021-09-28 15:33:49 -04:00
Sanjay Patel	09c575e728	[InstCombine] add/move tests for shl with binop; NFC	2021-09-28 14:46:27 -04:00
Adrian Prantl	f6954bf804	Improve the effectiveness of BDCE's debug info salvaging This patch improves the effectiveness of BDCE's debug info salvaging by processing the instructions in reverse order and delaying dropAllReferences until after debug info salvaging. This allows salvaging of entire chains of deleted instructions! Previously we would remove all references from an instruction, which would make it impossible to use that instruction to salvage a later instruction in the instruction stream, because its operands were already removed. Differential Revision: https://reviews.llvm.org/D110568	2021-09-28 10:24:51 -07:00
Adrian Prantl	9637b045e6	Improve the effectiveness of ADCE's debug info salvaging This patch improves the effectiveness of ADCE's debug info salvaging by processing the instructions in reverse order and delaying dropAllReferences until after debug info salvaging. This allows salvaging of entire chains of deleted instructions! Previously we would remove all references from an instruction, which would make it impossible to use that instruction to salvage a later instruction in the instruction stream, because its operands were already removed. Differential Revision: https://reviews.llvm.org/D110462	2021-09-28 10:24:50 -07:00
Adrian Prantl	1b998a5f0c	Add salvageDebugInfo support for truncating/extending ptr/int conversions. This patch enables debug info salvaging for truncating/extending ptr int conversions. The testcase uncovered a bug in adce, which is addressed separately. rdar://80227769 Differential Revision: https://reviews.llvm.org/D110461	2021-09-28 10:24:50 -07:00
Paul Robinson	56e681afcc	[TargetLibraryInfo] Pick new/delete calls by target There are two sets of new/delete functions, one with Windows/MSVC mangling and one with Itanium mangling. Mark one set or the other as unavailable depending on the target. Split the test malloc-free-delete.ll into three parts: malloc-free.ll for the C API tests, new-delete-itanium.ll and new-delete-msvc.ll for the target-specific new/delete tests. Differential Revision: https://reviews.llvm.org/D110419	2021-09-28 10:10:25 -07:00
Alex Richardson	ebb3dc0833	[InstCombine] Fold ptrtoint(gep i8 null, x) -> x This commit is the InstCombine follow-up to the previous constant-folding change that enables noticeable optimizations for CHERI-enabled targets. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D110247	2021-09-28 17:57:37 +01:00
Alex Richardson	9049a1c61e	[ConstantFolding] Fold ptrtoint(gep i8 null, x) -> x I was looking at some missed optimizations in CHERI-enabled targets and noticed that we weren't removing vtable indirection for calls via known pointers-to-members. The underlying reason for this is that we represent pointers-to-function-members as {i8 addrspace(200)*, i64} and generate the constant offsets using (gep i8 null, <index>). We use a constant GEP here since inttoptr should be avoided for CHERI capabilities. The pointer-to-member call uses ptrtoint to extract the index, and due to this missing fold we can't infer the actual value loaded from the vtable. This is the initial constant folding change for this pattern, I will add an InstCombine fold as a follow-up. We could fold all inbounds GEP to null (and therefore the ptrtoint to zero) since zero is the only valid offset for an inbounds GEP. If the offset is not zero, that GEP is poison and therefore returning 0 is valid (https://alive2.llvm.org/ce/z/Gzb5iH). However, Clang currently generates inbounds GEPs on NULL for hand-written offsetof() expressions, so this could lead to miscompilations. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D110245	2021-09-28 17:57:36 +01:00
Alex Richardson	fc0051011e	[InstCombine][ConstantFold] Baseline tests for ptrtoint(gep null, x) Differential Revision: https://reviews.llvm.org/D110244	2021-09-28 17:57:36 +01:00
Alexey Bataev	f701505c45	[SLP]Improve vectorization of phi nodes by trying wider vectors. Try to improve vectorization of the PHI nodes by trying to vectorize similar instructions at the size of the widest possible vectors, then aggregating with compatible type PHIs and trying to vectorize again, and only if this fails, try smaller vector factors for compatible PHI nodes. This restores performance of several benchmarks after tuning of the fp/int conversion instructions costs. Differential Revision: https://reviews.llvm.org/D108740	2021-09-28 07:20:36 -07:00
Sjoerd Meijer	0ea77502e2	[LoopFlatten] Updating Phi nodes after IV widening In rG6a076fa9539e, a problem with updating the old/narrow phi nodes after IV widening was introduced. If after widening of the IV the transformation is not applied, the narrow phi node was incorrectly modified, which should only happen if flattening happens. This can be seen in the added test widen-iv2.ll, which incorrectly had 1 incoming value, but should have its original 2 incoming values, which is now restored. Differential Revision: https://reviews.llvm.org/D110234	2021-09-28 15:09:20 +01:00
Sanjay Patel	1a1aed8da8	[InstCombine] add tests for icmp-gep; NFC We need more coverage for commuted and (un)signed preds to verify that things behave as expected here. Currently, we do not transform signed preds or non-inbounds geps.	2021-09-28 10:00:35 -04:00
Sanjay Patel	72d991c42e	[InstCombine] add/move tests for icmp with gep operand(s); NFC	2021-09-28 09:40:52 -04:00
Alexey Bataev	8bacfb9bed	[SLP]No need to schedule/check parent for extract{element/value} instruction. The extractelement/extractvalue instructions are not required to be scheduled since they only depend on the source vector/aggregate (with constant indices); the same applies to the parent basic block checks. Improves compile time and saves scheduling budget. Differential Revision: https://reviews.llvm.org/D108703	2021-09-28 06:13:55 -07:00
Florian Hahn	e2f6290e06	[VectorCombine] Discard ScalarizationResult state in early exit. ScalarizationResult's destructor makes sure ToFreeze is not ignored if set. Currently, scalarizeLoadExtract has an early exit if the index is not safe directly. But when it is SafeWithFreeze, we need to discard the state first, otherwise we hit the assert in the destructor. Fixes PR51992.	2021-09-28 12:52:16 +01:00
Florian Hahn	764d9aa979	Recommit "[SCEV] Look through single value PHIs." (take 2) This reverts commit `8fdac7cb7a`. The issue causing the revert has been fixed a while ago in `60b852092c`. Original message: Now that SCEVExpander can preserve LCSSA form, we do not have to worry about LCSSA form when trying to look through PHIs. SCEVExpander will take care of inserting LCSSA PHI nodes as required. This increases precision of the analysis in some cases. Reviewed By: mkazantsev, bmahjour Differential Revision: https://reviews.llvm.org/D71539	2021-09-28 10:32:17 +01:00
Anna Thomas	90fb73aa73	[LoopPred Test] Fix lld-x86_64-win BB failure Need a more general CHECK line for testcase in `5df9112` for correctly handling lld-x86_64-win buildbot.	2021-09-27 21:28:46 -04:00
Anna Thomas	5df9112ce3	Reland "[LoopPredication] Add testcase showing BPI computation. NFC" This relands commit `16a62d4f`. Relanded after fixing CHECK-LINES for opt pipeline output to be more general (based on failures seen in buildbot).	2021-09-27 21:15:46 -04:00
Anna Thomas	a0a9e3e05f	Revert "[LoopPredication] Add testcase showing BPI computation. NFC" This reverts commit `16a62d4f3d`. Needs some update to check lines to fix bb failure.	2021-09-27 17:08:57 -04:00
Anna Thomas	16a62d4f3d	[LoopPredication] Add testcase showing BPI computation. NFC Precommit testcase for D110438. Since we do not preserve BPI in loop pass manager, we are forced to compute BPI every time Loop predication is invoked. The patch referenced changes that behaviour by preserving lossy BPI for loop passes.	2021-09-27 16:54:22 -04:00
Sanjay Patel	b75ed244af	[InstCombine] add tests for shl-of-sub; NFC	2021-09-27 14:56:01 -04:00
Sanjay Patel	623f93ed1c	[InstCombine] add use check to shl transform This bug was introduced with the refactoring in: `9075edc89b` ...but there were no tests to detect it.	2021-09-27 14:10:26 -04:00
Sanjay Patel	d992950078	[InstCombine] add tests for opposing shifts separated by trunc; NFC	2021-09-27 14:10:26 -04:00
Jameson Nash	e27a6db529	Bad SLPVectorization shufflevector replacement, resulting in write to wrong memory location We see that it might otherwise do: %10 = getelementptr {}, <2 x {}> %9, <2 x i32> <i32 10, i32 4> %11 = bitcast <2 x {}*> %10 to <2 x i64> ... %27 = extractelement <2 x i64> %11, i32 0 %28 = bitcast i64 %27 to <2 x i64>* store <2 x i64> %22, <2 x i64>* %28, align 4, !tbaa !2 Which is an out-of-bounds store (the extractelement got offset 10 instead of offset 4 as intended). With the fix, we correctly generate extractelement for i32 1 and generate correct code. Differential Revision: https://reviews.llvm.org/D106613	2021-09-27 14:06:13 -04:00
Sanjay Patel	21429cf43a	[InstCombine] generalize fold for (trunc (X u>> C1)) u>> C This is another step towards trying to re-apply D110170 by eliminating conflicting transforms that cause infinite loops. `a47c8e40c7` was a previous patch in this direction. The diffs here are mostly cosmetic, but intentional: 1. The existing code that would handle this pattern in FoldShiftByConstant() is limited to 'shl' only now. The formatting change to IsLeftShift shows that we could move several transforms into visitShl() directly for efficiency because they are not common shift transforms. 2. The tests are regenerated to show new instruction names to prove that we are getting (almost) identical logic results. 3. The one case where we differ ("trunc_sandwich_small_shift1") shows that we now use a narrow 'and' instruction. Previously, we relied on another transform to do that, but it is limited to legal types. That seems to be a legacy constraint from when IR analysis and codegen were less robust. https://alive2.llvm.org/ce/z/JxyGA4 declare void @llvm.assume(i1) define i8 @src(i32 %x, i32 %c0, i8 %c1) { ; The sum of the shifts must not overflow the source width. %z1 = zext i8 %c1 to i32 %sum = add i32 %c0, %z1 %ov = icmp ult i32 %sum, 32 call void @llvm.assume(i1 %ov) %sh1 = lshr i32 %x, %c0 %tr = trunc i32 %sh1 to i8 %sh2 = lshr i8 %tr, %c1 ret i8 %sh2 } define i8 @tgt(i32 %x, i32 %c0, i8 %c1) { %z1 = zext i8 %c1 to i32 %sum = add i32 %c0, %z1 %maskc = lshr i8 -1, %c1 %s = lshr i32 %x, %sum %t = trunc i32 %s to i8 %a = and i8 %t, %maskc ret i8 %a }	2021-09-27 10:57:31 -04:00
Sjoerd Meijer	eba76056a3	[FuncSpec] Don't specialise (or crash) on poison or constexpr values Function specialization was crashing on poison values and constexpr values. The problem is that these values are not added to the solver, so it crashes when a lookup is performed for these values. This fixes that by not specialising on these values. For poison that is obvious, but for constexpr this is a change in behaviour. Thus, in one way this is a bit of a stopgap, but specialising on constexpr values wasn't done very intentionally, and needs some more work and tests if we want to support this. As a follow up, we need to look at whether the solver should exit more gracefully and return a "don't know", or whether it should really support these constexprs. This should fix PR51600 (https://bugs.llvm.org/show_bug.cgi?id=51600). Differential Revision: https://reviews.llvm.org/D110529	2021-09-27 14:58:53 +01:00
Sjoerd Meijer	a588ae482b	[LoopFlatten] Precommit new test widen-iv2.ll for D110234.	2021-09-27 14:37:44 +01:00
Jun Ma	3a998c06a8	Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when switch cover all possible values.""" This reverts commit `8ba2adcf9e`.	2021-09-27 20:39:05 +08:00
Florian Hahn	4b581e87df	[LV] Add tests where rt checks may make vectorization unprofitable. Add a few additional tests which require a large number of runtime checks for D109368.	2021-09-27 10:32:28 +01:00
Max Kazantsev	e787678cef	[Test] Add some simple tests where IndVars cannot remove a check in loop Previously I've added tests that require context for inference, but it seems that SCEV can't prove the same facts even when the context isn't required.	2021-09-27 12:12:51 +07:00
Sanjay Patel	6063e6b499	[InstCombine] move add after min/max intrinsic This is another regression noted with the proposal to canonicalize to the min/max intrinsics in D98152. Here are Alive2 attempts to show correctness without specifying exact constants: https://alive2.llvm.org/ce/z/bvfCwh (smax) https://alive2.llvm.org/ce/z/of7eqy (smin) https://alive2.llvm.org/ce/z/2Xtxoh (umax) https://alive2.llvm.org/ce/z/Rm4Ad8 (umin) (if you comment out the assume and/or no-wrap, you should see failures) The different output for the umin test is due to a fold added with `c4fc2cb5b2` : // umin(x, 1) == zext(x != 0) We probably want to adjust that, so it applies more generally (umax --> sext or patterns where we can fold to select-of-constants). Some folds that were ok when starting with cmp+select may increase instruction count for the equivalent intrinsic, so we have to decide if it's worth altering a min/max. Differential Revision: https://reviews.llvm.org/D110038	2021-09-26 09:49:10 -04:00
Nikita Popov	ba664d9066	[AA] Move earliest escape tracking from DSE to AA This is a followup to D109844 (and alternative to D109907), which integrates the new "earliest escape" tracking into AliasAnalysis. This is done by replacing the pre-existing context-free capture cache in AAQueryInfo with a replaceable (virtual) object with two implementations: The SimpleCaptureInfo implements the previous behavior (check whether object is captured at all), while EarliestEscapeInfo implements the new behavior from DSE. This combines the "earliest escape" analysis with the full power of BasicAA: It subsumes the call handling from D109907, considers a wider range of escape sources, and works with AA recursion. The compile-time cost is slightly higher than with D109907. Differential Revision: https://reviews.llvm.org/D110368	2021-09-25 22:40:41 +02:00
Nikita Popov	327bbbb10b	[DSE] Make capture check more precise It is sufficient that the object has not been captured before the load that produces the pointer we're loading. A capture after that cannot affect the already loaded pointer. This is a small part of D110368 applied separately.	2021-09-25 22:23:19 +02:00
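
Several of the commits listed above describe small integer folds: `(-1 << x) s>> x --> -1` (4414e2ad97), the generalized `(trunc (X u>> C1)) u>> C` fold (21429cf43a), and `umin(x, 1) == zext(x != 0)` (mentioned in 6063e6b499). The following is a minimal brute-force sketch in Python, not LLVM code, that checks these identities on small bit widths. The helper names (`shl`, `lshr`, `ashr`, `check_*`) and the chosen widths (8 and 4 bits instead of the i32/i8 used in the Alive2 proof) are assumptions made for illustration, and shift amounts are restricted to values below the bit width because larger amounts yield poison in LLVM IR.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def shl(value, amount, bits):
    """Left shift, truncated to `bits` bits."""
    return (value << amount) & ((1 << bits) - 1)


def lshr(value, amount, bits):
    """Logical (zero-filling) right shift on a `bits`-wide value."""
    return (value & ((1 << bits) - 1)) >> amount


def ashr(value, amount, bits):
    """Arithmetic (sign-filling) right shift on a `bits`-wide value."""
    return (to_signed(value, bits) >> amount) & ((1 << bits) - 1)


def check_shl_ashr_all_ones(bits=8):
    """(-1 << x) s>> x == -1 for every in-range shift amount x."""
    all_ones = (1 << bits) - 1  # bit pattern of -1
    return all(
        ashr(shl(all_ones, x, bits), x, bits) == all_ones for x in range(bits)
    )


def check_trunc_shift_fold(wide=8, narrow=4):
    """trunc(x u>> c0) u>> c1 == trunc(x u>> (c0 + c1)) & (-1 u>> c1).

    Mirrors the src/tgt pair in the commit message, under the same
    precondition that c0 + c1 stays below the wide bit width.
    """
    narrow_mask = (1 << narrow) - 1
    for x in range(1 << wide):
        for c0 in range(wide):
            for c1 in range(narrow):
                if c0 + c1 >= wide:
                    continue  # excluded by the assume in the proof
                src = lshr(lshr(x, c0, wide) & narrow_mask, c1, narrow)
                mask = lshr(narrow_mask, c1, narrow)
                tgt = (lshr(x, c0 + c1, wide) & narrow_mask) & mask
                if src != tgt:
                    return False
    return True


def check_umin_one(bits=8):
    """umin(x, 1) == zext(x != 0) for every unsigned x."""
    return all(min(x, 1) == int(x != 0) for x in range(1 << bits))


print(check_shl_ashr_all_ones(), check_trunc_shift_fold(), check_umin_one())
```

An exhaustive check over small widths like this is only a sanity test; the Alive2 links in the commit messages remain the actual correctness arguments for the real bit widths.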


19812 Commits