llvm-project

Commit Graph

Author	SHA1	Message	Date
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
Kazu Hirata	d66cbc565a	Don't use Optional::hasValue (NFC)	2022-06-20 20:26:05 -07:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Florian Hahn	6dd772d348	[ConstraintElimination] Move logic to get a constraint to helper (NFC).	2022-06-20 21:34:07 +02:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
Florian Hahn	cebe7ae881	[ConstraintElimination] Move logic to add constraint to helper (NFC).	2022-06-20 17:08:35 +02:00
Florian Hahn	bd9632afd2	[ConstraintElimination] Move StackEntry up, to allow use earlier (NFC).	2022-06-20 16:40:42 +02:00
Kazu Hirata	129b531c9c	[llvm] Use value_or instead of getValueOr (NFC)	2022-06-18 23:07:11 -07:00
Kazu Hirata	b254d67160	[llvm] Call *set::insert without checking membership first (NFC)	2022-06-18 08:32:54 -07:00
Florian Hahn	7c0089d735	[Matrix] Check if iterator is at beginning of BB in optimizeTranspose. If an instruction at the beginning of a block is erased, this may trigger crash due to dereferencing an invalid iterator. Check if II is at the end before dereferencing it. Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D127736	2022-06-14 21:37:02 +01:00
Florian Hahn	782e912246	[ConstraintElimination] Support constraints with only const ops. Remove the early exit if both constraints contain no variables. This restriction is unnecessayr for correctness and removing it simplifies handling of trivial constant conditions in follow-up changes.	2022-06-14 10:37:12 +01:00
Guillaume Chatelet	f9bb8c24ac	[NFC][Alignment] Convert MemCpyOptimizer.cpp	2022-06-13 10:07:09 +00:00
Kazu Hirata	5d7b1a5f1b	[Scalar] Use llvm::append_range (NFC)	2022-06-10 23:09:01 -07:00
Guillaume Chatelet	dc9c2eac98	[NFC][Alignment] Simplify code	2022-06-10 15:25:28 +00:00
Guillaume Chatelet	12ccdd67aa	[NFC] Use proper getSliceAlign type in SROA	2022-06-10 12:37:41 +00:00
Philip Reames	206f10d3f6	Plumb InstructionCost through unroll costing Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundemental limitation or an unimplemented cost model case. Differential Revision: https://reviews.llvm.org/D127305	2022-06-09 15:42:53 -07:00
Philip Reames	f85c5079b8	Pipe potentially invalid InstructionCost through CodeMetrics Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred. On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost. I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change. Differential Revision: https://reviews.llvm.org/D127131	2022-06-09 15:17:24 -07:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Philip Reames	89c4b29e8d	[GuardWidening] Fix a nasty cast bug in `c2eccc6` `c2eccc6` introduced a call to etHasNoUnsignedWrap which implicitly assumes that Inst is a OverflowingBinaryOperator. This is frequently untrue, but was not caught because cast<Ty>(X) has been broken, see https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033 for context. I considered reverting this, but since doing so re-introduces a nasty miscompile of its own, I decided to fix forward instead. I'll note that this is a particularly nasty form of the cast<Ty>(X) issue. Because the cast was succeeding unexpected, we were writing data to instructions which weren't OBOs. This could result in near arbitrary data or memory corruption. I'm a bit shocked that the sanitizers didn't find this TBH.	2022-06-07 13:27:13 -07:00
Craig Topper	d73684e223	[LoopFlatten] Fix crash if the inner loop trip count comes from a sext instruction. If we look through a truncate in matchLinearIVUser, it's possible we find a sext/zext instruction that didn't come from widening. This will fail the MatchedItCount->getType() == InnerInductionPHI->getType() assertion. Fix this by checking that we did not look through a truncate already. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D127149	2022-06-07 08:21:21 -07:00
Craig Topper	fdd5843572	[LoopFlatten] Replace unchecked dyn_cast with cast. Spotted while reading through the code. Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D127146	2022-06-07 08:21:00 -07:00
Kevin P. Neal	a1f1bd547b	[IPSCCP] Switch away from Instruction::isSafeToRemove() In D115737 I found that I needed to teach Instruction::isSafeToRemove() about strictfp/constrained intrinsics. It was pointed out that this is probably the wrong function to use isInstructionTriviallyDead(). It doesn't make sense to have a "second, worse implementation". I also believe that the Instruction class is the wrong place for this functionality. The information about whether or not an instruction can be removed is in the transform passes and should stay there. Differential Revision: https://reviews.llvm.org/D118387	2022-06-06 09:24:11 -04:00
Kazu Hirata	8daf23d364	[Scalar] Use llvm::make_early_inc_range (NFC)	2022-06-05 23:53:18 -07:00
Kazu Hirata	30f19382c6	[Scalar] Remove isValidSingle (NFC) The last use was removed on Feb 18, 2022 in commit `00ab91b70d`.	2022-06-05 08:45:11 -07:00
Fangrui Song	95a134254a	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 01:07:51 -07:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Kazu Hirata	e0039b8d6a	Use llvm::less_second (NFC)	2022-06-04 22:48:32 -07:00
Kazu Hirata	f83a88a179	[Transforms] Use llvm::is_contained (NFC)	2022-06-04 20:48:26 -07:00
Fangrui Song	36c7d79dc4	Remove unneeded cl::ZeroOrMore for cl::opt options Similar to `557efc9a8b`. This commit handles options where cl::ZeroOrMore is more than one line below cl::opt.	2022-06-04 00:10:42 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00
Daniil Suchkov	f1940a5895	Revert "[LoopInterchange] New cost model for loop interchange" Reverting the commit due to numerous buildbot failures. This reverts commit `006334470d`.	2022-06-03 00:52:08 +00:00
Congzhe Cao	006334470d	[LoopInterchange] New cost model for loop interchange This patch proposed to use a new cost model for loop interchange, which is obtained from loop cache analysis. Given a loopnest, what loop cache analysis returns is a vector of loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost loop, loop1 should be placed one more level inside, and loop2 one more level inside, etc. What loop cache analysis does is not only more comprehensive than the current cost model, it is also a "one-shot" query which means that we only need to query it once during the entire loop interchange pass, which is better than the current cost model where we query it every time we check whether it is profitable to interchange two loops. Thus complexity is reduced, especially after D120386 where we do more interchanges to get the globally optimal loop access pattern. Updates made to test cases are mostly minor changes and some corrections. Test coverage for loop interchange is not reduced. Currently we did not completely remove the legacy cost model, but keep it as fall-back in case the new cost model did not run successfully. This is because currently we have some limitations in delinearization, which sometimes makes loop cache analysis bail out. The longer term goal is to enhance delinearization and eventually remove the legacy cost model compeletely. Reviewed By: bmahjour, #loopoptwg Differential Revision: https://reviews.llvm.org/D124926	2022-06-02 19:07:14 -04:00
eopXD	6eab5cade7	[LSR] Early exit for RateFormula when it is already losing. NFC This patch does not effect any behavior of the current code. The codebase implicitly implies that `Cost::RateFormula` is only called when the `Cost` is not in losing status, or else there may be possible to trigger the assertion of `Cost::isValid`. The intention here is to prevent mis-use where future development allow `Cost` that is already loser to call `Cost::RateFormula` - Early exit when `Cost` is already losing. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D125670	2022-06-01 21:02:40 -07:00
Eli Friedman	abdf0da800	[LoopIdiom] Fix bailout for aliasing in memcpy transform. Commit `dd5991cc` modified the aliasing checks here to allow transforming a memcpy where the source and destination point into the same object. However, the change accidentally made the code skip the alias check for other operations in the loop. Instead of completely skipping the alias check, just skip the check for whether the memcpy aliases itself. Differential Revision: https://reviews.llvm.org/D126486	2022-05-31 17:24:23 -07:00
Nuno Lopes	80b3dcc045	[Support] Make report_fatal_error respect its GenCrashDiag argument so it doesn't generate a backtrace There are a few places where we use report_fatal_error when the input is broken. Currently, this function always crashes LLVM with an abort signal, which then triggers the backtrace printing code. I think this is excessive, as wrong input shouldn't give a link to LLVM's github issue URL and tell users to file a bug report. We shouldn't print a stack trace either. This patch changes report_fatal_error so it uses exit() rather than abort() when its argument GenCrashDiag=false. Reviewed by: nikic, MaskRay, RKSimon Differential Revision: https://reviews.llvm.org/D126550	2022-05-30 19:19:23 +01:00
Nikita Popov	1721ff1dfd	[GVN] Enable enable-split-backedge-in-load-pre option by default This option was added in D89854. It prevents GVN from performing load PRE in a loop, if doing so would require critical edge splitting on the backedge. From the review: > I know that GVN Load PRE negatively impacts peeling, > loop predication, so the passes expecting that latch has > a conditional branch. In the PhaseOrdering test in this patch, splitting the backedge negatively affects vectorization: After critical edge splitting, the loop gets rotated, effectively peeling off the first loop iteration. The effect is that the first element is handled separately, then the bulk of the elements use a vectorized reduction (but using unaligned, off-by-one memory accesses) and then a tail of 15 elements is handled separately again. It's probably worth noting that the loop load PRE from D99926 is not affected by this change (as it does not need backedge splitting). This is about normal load PRE that happens to occur inside a loop. Differential Revision: https://reviews.llvm.org/D126382	2022-05-30 09:55:58 +02:00
Max Kazantsev	503d5771b6	[JumpThreading][NFCI] Reuse existing DT instead of recomputation This whole part with recomputation of BPI and BFI looks redundant, and we tried to get rid of it in D124439. Unfortunately, it causes some hard-to-reproduce failures due to invalid state of analysis. Until this is investigated and fixed, let's try to reuse at least part of available analyzes. DT is available at this point, and there is no need to recompute it. Please revert if you see it causing any behavior changes.	2022-05-30 12:48:10 +07:00
Florian Hahn	0776c48f9b	Recommit "[LICM] Only create load in ph when promoting load or store doesn't exec." This reverts the revert commit `ad95255b92`. The updated version also creates a load when the store may not execute. In those cases, we still need to introduce a load in a function where there may not have been one before, so this doesn't completely resolve issue #51248. Original message: When only a store is sunk, there is no need to create a load in the pre-header, as the result of the load will never get used. The dead load can can introduce UB, if the function is marked as writeonly. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123473	2022-05-29 21:57:14 +01:00
eopXD	6a84579243	[LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D126350	2022-05-27 15:22:23 -07:00
Arthur Eubanks	36096c2b38	[NFC][JumpThreading] Remove InsertFreezeWhenUnfoldingSelect pass parameter All callers pass true. select-unfold-freeze.ll is now a subset of select.ll so delete it. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126501	2022-05-26 16:13:34 -07:00
Owen Anderson	939a43461b	Revert "Replace the custom linked list in LeaderTableEntry with TinyPtrVector." This reverts commit `1e91149844`. Pending further discussion.	2022-05-26 09:50:36 -07:00
Alex Zhikhartsev	8b0d763474	[DFAJumpThreading] Relax analysis to handle unpredictable initial values Responding to a feature request from the Rust community: https://github.com/rust-lang/rust/issues/80630 void foo(X) { for (...) switch (X) case A X = B case B X = C } Even though the initial switch value is non-constant, the switch statement can still be threaded: the initial value will hit the switch statement but the rest of the state changes will proceed by jumping unconditionally. The early predictability check is relaxed to allow unpredictable values anywhere, but later, after the paths through the switch statement have been enumerated, no non-constant state values are allowed along the paths. Any state value not along a path will be an initial switch value, which can be safely ignored. Differential Revision: https://reviews.llvm.org/D124394	2022-05-26 11:29:54 -04:00
Florian Hahn	f96aa493f0	[SimpleLoopUnswitch] Always skip trivial select and set condition. When updating the branch instruction outside the loopduring non-trivial unswitching, always skip trivial selects and update the condition. Otherwise we might create invalid IR, because the trivial select is inside the loop, while the condition is outside the loop. Fixes #55697.	2022-05-26 09:46:24 +01:00
Owen Anderson	1e91149844	Replace the custom linked list in LeaderTableEntry with TinyPtrVector. The purpose of the custom linked list was to optimize for the case of a single-element list. It turns out that TinyPtrVector handles the same basic scenario even better, reducing the size of LeaderTableEntry by 33%, and requiring only log2(N) allocations as the size of the list grows. The only downside is that we have to store the Value's and BasicBlock's in separate vectors, which is slightly awkward in a few cases. Fortunately that ends up being entirely encapsulated inside helper functions. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D125205	2022-05-25 23:52:44 -07:00
Serguei Katkov	c2eccc67ce	[GuardWidening] Remove nuw/nsw flags for hoisted instructions When we hoist instructions over guard we must clear flags due to these flags might be implied using this guard, so they make sense only after the guard. As an example of the bug due to current behavior. L is known to be in range say [0, 100) c1 = x u< L guard (c1) x1 = add x, 1 c2 = x1 u< L guard(c2) basing on guard(c1) we can say that x1 = add nuw nsw x, 1 after guard widening we get c1 = x u< L x1 = add nuw nsw x, 1 c2 = x1 u< L c = and c1, c2 guard(c) now, basing on fact that x + 1 < L and x >= 0 due to x + 1 is nuw we can prove that x + 1 u< L implies that x u< L, so we can just remove c1 x1 = add nuw nsw x, 1 c2 = x1 u< L guard(c2) But that is not correct due to we will pass x == -1 value. Reviewed By: mkazantsev Subscribers: llvm-commits, nikic Differential Revision: https://reviews.llvm.org/D126354	2022-05-26 13:20:55 +07:00
Nikita Popov	6f0ca6fd23	[JumpThreading] Insert freeze when unfolding select JumpThreading may convert selects into branch instructions, in which case the condition needs to be frozen (as branch on poison is immediate undefined behavior, unlike select on poison). The necessary code for this is already in place, this just enables the option. Differential Revision: https://reviews.llvm.org/D125869	2022-05-21 11:24:27 +02:00
Florian Hahn	32d6ef36d6	[SimpleLoopUnswitch] Skip trivial selects during trivial unswitching. Update the remaining places in unswitchTrivialBranch to properly skip trivial selects. Fixes #55526.	2022-05-19 17:01:13 +01:00
Jay Foad	6bec3e9303	[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions. The OrSelf versions additionally have the strange behaviour of allowing extending to a smaller width, or truncating to a larger width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting. Differential Revision: https://reviews.llvm.org/D125557	2022-05-19 11:23:13 +01:00
Nikita Popov	c9e7049754	[JumpThreading] Look through freeze in getPredicateAt() fold This code is valid for any icmp, so we can safely look through a freeze when trying to find one. A caveat here is that replaceFoldableUses() may not end up replacing any uses in this case. It might make sense to use the freeze as the context instruction (rather than the terminator) if there is a freeze, to ensure that it always gets folded. This would require some changes to how replaceFoldedUses() works though, as it currently assumes that the value is valid at the end of the block.	2022-05-18 12:09:59 +02:00
Nikita Popov	18c70a7bd9	[JumpThreading] Simplify getPredicateAt() based folding It's sufficient to just fold the icmp to true/false here, and then let constant terminator folding take care of the rest. It should be noted that while replaceFoldableUses() may not replace all uses of the icmp, at least the use in the terminator we're working on is always replaceable, so terminator constant folding should be reliably enabled as a subsequent step.	2022-05-18 11:24:52 +02:00
Nikita Popov	d4cdf013c7	[JumpThreading] Use common code to skip freeze (NFC) There are multiple places that want to look through freeze, so store condition without freeze in a separate variable.	2022-05-18 10:49:41 +02:00
Juneyoung Lee	3adcf96b4f	[JumpThreading] Let ProcessImpliedCondition look into freeze instructions This patch makes JumpThreading's ProcessImpliedCondition deal with frozen conditions. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D84941	2022-05-18 10:41:31 +09:00
Nikita Popov	9ba452b08e	[JumpThreading] Don't pass DT to isGuaranteedNotToBeUndefOrPoison() JumpThreading intentionally does not force updating of the DT during optimization, because this may be expensive when many CFG updates and DT calculations are interleaved. We shouldn't be fetching the DT just for the purpose of calling isGuaranteedNotToBeUndefOrPoison(), especially as DT availability doesn't even show benefit in tests.	2022-05-17 11:53:49 +02:00
Dmitry Vassiliev	7759680e2f	[SROA] Avoid postponing rewriting load/store by ignoring lifetime intrinsics in partition's promotability checking This patch fixes a bug that generates unnecessary packing/unpacking structure code because of incorrectly handling lifetime intrinsic. For example, a partition of an alloca may contain many slices: ``` Partition [0, 4): Slice0: [0, 4) used by: load i32 addr; Slice1: [0, 4) used by: store i32 v, addr; Slice2: [0, 16) used by lifetime.start(16, addr); ``` When SROA determines if the partition can be promoted, lifetime.start is currently treated as a whole alloca load/store, so Slice0 and Slice1 cannot be promoted at this attempt, but the packing/unpacking code for Slice0 and Slice1 has been generated. After rewrite lifetime.start/end intrinsic, SROA tries again with Slice0 and Slice1 and finally promotes them, but redundant packing/unpacking code remaining in the IRs. This patch changes promotability checking to ignore lifetime intrinsic (they will be rewritten to correct sizes later), so we can promote the real users (load/store) at the first attempt with optimal code. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124967	2022-05-17 11:25:59 +02:00
Fangrui Song	60e5fd00cd	[RS4GC] Fix -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off build after D125000	2022-05-14 10:47:50 -07:00
Nikita Popov	ed1cb01baf	[IRBuilder] Add IsInBounds parameter to CreateGEP() We commonly want to create either an inbounds or non-inbounds GEP based on a boolean value, e.g. when preserving inbounds from existing GEPs. Directly accept such a boolean in the API, rather than requiring a ternary between CreateGEP and CreateInBoundsGEP. This change is not entirely NFC, because we now preserve an inbounds flag in a constant expression edge-case in InstCombine.	2022-05-13 14:30:55 +02:00
Florian Hahn	8e6d481f3b	[ConstraintElimination] Simplify ssub(A,B) if B s>=b && B s>=0. A first patch to use the reasoning in ConstraintElimination to simplify sub with overflow to a regular sub, if the operation is guaranteed to not overflow. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D125264	2022-05-13 13:19:41 +01:00
Dmitry Makogon	1da42c9f71	[RS4GC] Cache BDVs and bases alogn with IsKnownBase flag (NFC) This refactors RS4GC to cache results returned findBaseDefiningValue and also gets rid of BaseDefiningValueResult by caching the IsKnownBase flag for BDVs and bases. Differential Revision: https://reviews.llvm.org/D125000	2022-05-13 14:14:17 +07:00
Craig Topper	4b36d9bde7	[CVP] Preserve exact name when converting sext->zext and ashr->lshr. Previously we took the old name and always appended a numberic suffix. Since we're doing a 1:1 replacement, it's clearer to keep the original name exactly. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D125281	2022-05-10 09:13:59 -07:00
Craig Topper	7b362ddda9	[SCCP] Preserve Name when converting SExt->ZExt. This makes the output IR more readable since we're doing a one to one replacement. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D125280	2022-05-10 09:13:59 -07:00
Dawid Jurczak	009f6ce0ef	[GVNSink] Make GVNSink resistant against self referencing instructions (PR36954) Before this change GVNSink pass suffers from stack overflow while processing self referenced instruction in unreachable basic block. According [1] and [2] it's reasonable to make pass resistant against self referencing instructions. To fix issue we skip sinking analysis when we reach instruction coming from unreachable block. [1] https://groups.google.com/g/llvm-dev/c/843Tig9IzwA [2] https://lists.llvm.org/pipermail/llvm-dev/2015-February/082629.html Differential Revision: https://reviews.llvm.org/D113897	2022-05-10 16:06:12 +02:00
Florian Hahn	41e142fdc7	Recommit "[SimpleLoopUnswitch] Collect either logical ANDs/ORs but not both." This reverts commit `7211d5ce07`. This version fixes a crash that caused buildbot failures with the first version.	2022-05-09 13:49:12 +01:00
Alexander Shaposhnikov	f827ee671f	[Scalar][NFC] Minor cleanups in CallSiteSplitting.cpp	2022-05-06 23:03:49 +00:00
Florian Hahn	7211d5ce07	Revert "[SimpleLoopUnswitch] Collect either logical ANDs/ORs but not both." This reverts commit `db7a87ed4f`. This seems to cause a PPC buildbot failure: https://lab.llvm.org/buildbot#builders/93/builds/8787	2022-05-06 22:38:15 +01:00
Max Kazantsev	5a08e81779	[RS4GC] Add support for 'freeze' instruction to findBaseDefiningValue Because this instruction is a noop, we can simply go through it in search of the base.	2022-05-06 20:46:29 +07:00
Max Kazantsev	e6a7afae03	[NFC] Fix typo in assert message	2022-05-06 20:31:34 +07:00
Florian Hahn	db7a87ed4f	[SimpleLoopUnswitch] Collect either logical ANDs/ORs but not both. After D97756, collectHomogenousInstGraphLoopInvariants may collect conditions for both logical ANDs and logical ORs in case the root is a select that matches both logical AND & OR. This means the function won't return invariant values of either AND/OR chains, but both. This can result in incorrect transformations. See llvm/test/Transforms/SimpleLoopUnswitch/trivial-unswitch-logical-and-or.ll. Without the patch, Alive2 rejects the modified tests with: Source and target don't have the same return domain. Note that this also applies to the test case added in D97756 (@test_partial_condition_unswitch_or_select). We can't unswitch on %cond6, because the graph leading to it contains and AND and an OR. This only fixes trivial unswitching for now, but a similar problem likely exists with non-trivial unswitching. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124526	2022-05-06 09:50:03 +01:00
Florian Hahn	6bd2b70877	[SimpleLoopUnswitch] Add freeze if branch execs for partial unswitching. We cannot skip the freezing the condition if the unswitched branch executes, if the condition is a chain of ANDs/ORs. For example, if if we have an AND %c1, %c2 with %c1 == undef and %c2 == 0, there would be no branch on undef in the original code, but a branch on undef if we unswitch %c1. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124603	2022-05-05 09:44:07 +01:00
Dawid Jurczak	9c46a9cf61	[NFC][GVNSink] Don't pretend that iteration is over instructions when it's actually over blocks Differential Revision: https://reviews.llvm.org/D124764	2022-05-03 17:19:40 +02:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as a overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00
Jonas Paulsson	304378fd09	Reapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls." (was `0f8c626`). This reverts commit `14d9390`. The patch previously failed to recognize cases where user had defined a function alias with an identical name as that of the library function. Module::getFunction() would then return nullptr which is what the sanitizer discovered. In this updated version a new function isLibFuncEmittable() has as well been introduced which is now used instead of TLI->has() anytime a library function is to be emitted . It additionally also makes sure there is e.g. no function alias with the same name in the module. Reviewed By: Eli Friedman Differential Revision: https://reviews.llvm.org/D123198	2022-05-02 19:37:00 +02:00
Florian Hahn	5387a38c38	[SimpleLoopUnswitch] Freeze individual OR/AND operands. In some cases, it is not enough to freeze the final AND/OR operation when chaining a number of invariant conditions together. After creating a chain of ANDs/ORs, we assume all unswitched operands to be either true or false. But if any of the operands is poison, the rest of the operands could have any value after branching on the frozen condition. To avoid that, freeze individual operands, if needed. In some cases this may lead to unnecessary freezes, but it seems required at least for some cases (see trivial-unswitch-freeze-individual-conditions.ll) Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124554	2022-05-01 20:11:05 +01:00
Florian Hahn	8b022f87b0	[SimpleLoopUnswitch] Freeze trivial conditions if needed. Trivial unswitching can also introduce new branches on undef/poison. Freeze the conditions if needed. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D124549	2022-04-30 19:53:36 +01:00
James Y Knight	02aa795785	Revert "[JumpThreading][NFC][CompileTime] Do not recompute BPI/BFI analyzes" This change has caused non-reproducibility of a self-build of Clang when using NewPM and providing profile data. This reverts commit `35f38583d2`.	2022-04-29 21:15:47 +00:00
Anna Thomas	205246cb64	[CompileTime] [Passes] Avoid computing unnecessary analyses. NFC Similar to `c515b2f39e`, If there are no loops in the function as seen through LI, we should avoid computing the remaining expensive analyses (such as SCEV, BPI). Reordered the analyses requests and early return if there are no loops. The logic of avoiding expensive analyses is applied to LoopVectorizer, LoopLoadElimination and LoopUnrollPass, i.e. all function passes which operate on loops. This is an NFC with compile time improvement. Differential Revision: https://reviews.llvm.org/D124529	2022-04-29 10:00:06 -04:00
Florian Hahn	fb4113ef0c	[Passes] Remove legacy LoopUnswitch pass. The legacy LoopUnswitch pass is only used in the legacy pass manager pipeline, which is deprecated. The NewPM replacement is SimpleLoopUnswitch and I think it is time to remove the legacy LoopUnswitch code. Fixes #31000. Reviewed By: aeubanks, Meinersbur, asbirlea Differential Revision: https://reviews.llvm.org/D124376	2022-04-29 10:30:49 +01:00
Chris Jackson	c792884589	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] Reland `3f2b76ec90` with the test corrected to require x86-registered-target. Differential Revision: https://reviews.llvm.org/D120169	2022-04-28 14:21:56 +01:00
Chris Jackson	cd5f9efc4d	Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]" This reverts commit `3f2b76ec90`.	2022-04-28 14:07:31 +01:00
Chris Jackson	3f2b76ec90	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] Reland commit `74273d575f` following a fix for a memory leak. The DVIRecoveryRecord vectors now use unique_ptr. Differential Revision: https://reviews.llvm.org/D120169	2022-04-28 13:55:49 +01:00
Nikita Popov	b9dc565147	[GVN] Encode GEPs in offset representation When using opaque pointers, convert GEPs into offset representation of the form P + V1 * Scale1 + V2 * Scale2 + ... + ConstantOffset. This allows us to recognize equivalent address calculations even if the GEPs don't use the same source element type. This fixes an opaque pointer codegen regression seen in rustc. Differential Revision: https://reviews.llvm.org/D124527	2022-04-28 09:32:05 +02:00
Max Kazantsev	35f38583d2	[JumpThreading][NFC][CompileTime] Do not recompute BPI/BFI analyzes They can already be available, and even if not, DT/LI can be available. We should not recompute them. Old PM is unchanged because it would require changing dependencies, and we don't care enough about it. Differential Revision: https://reviews.llvm.org/D124439 Reviewed By: nikic, aeubanks	2022-04-28 10:46:08 +07:00
Wenju He	96d3be8443	[InferAddressSpaces] Check if AS are the same in isNoopPtrIntCastPair isNoopAddrSpaceCast is expecting SrcAS is different from DestAS. If the two AS are the same, consider ptrtoint/inttoptr as noop cast. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D123573	2022-04-28 11:10:55 +08:00
Kirill Stoimenov	761366e6ae	Revert "[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2]" This reverts commit `74273d575f`. Buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/22795 Failing with memory leak.	2022-04-27 23:11:48 +00:00
Anna Thomas	c515b2f39e	[IRCE] Avoid computing potentially unnecessary analyses. NFC IRCE is a function pass that operates on loops. If there are no loops in the function (as seen through LI), we should avoid computing the remaining expensive analyses (such as BPI). Reordered the analyses requests and early return if there are no loops. This is an NFC with compile time improvement. The same will be done in a follow-up patch for the loop vectorizer. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D124478	2022-04-27 09:22:10 -04:00
Chris Jackson	74273d575f	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] This relands commit `8f550368b1`. The test is amended with REQUIRES: x86-registered-target, in line with the other debuginfo-scev-salvage tests. Differential Revision: https://reviews.llvm.org/D120169	2022-04-27 13:10:30 +01:00
Chris Jackson	855752e563	Revert [Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics[2/2] This reverts commit `8f550368b1`.	2022-04-27 13:06:03 +01:00
Chris Jackson	8f550368b1	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [2/2] Second of two patches to extend SCEV-based salvaging to dbg.value intrinsics that have multiple location ops pre-LSR. This second patch adds the core implementation. Reviewers: @StephenTozer, @djtodoro Differential Revision: https://reviews.llvm.org/D120169	2022-04-27 12:47:35 +01:00
Chris Jackson	c45e4c140f	[Debuginfo][LSR] Add salvaging variadic dbg.value intrinsics [1/2] [NFC] First of two patches that extends SCEV-based salvaging to enable salvaging of dbg.value instrinsics that have multiple locations ops before the Loop Strength Reduction pass. The existing single-op SCEV-based salvaging can generate variadic dbg.value intrinsics in order to salvage a dbg.value that has a single location op. If a dbg.value has multiple location ops before LSR, and LSR optimises away one or more of the location operands, then currently no salvaging will be attempted. Salvaging can now be added, but first this patch cleans up consistency in both the code and comments, and applies some refactoring to make application of the new salvaging implementation more straightforward. - Use SCEVDbgValueBuilder for both types of recovery expressions: IV-offset based and iteration count based. - Combine the functions that write the final DIExpression. - Move some static functions into member functions. Reviewers: @Orlando Differential Revision: https://reviews.llvm.org/D120168	2022-04-27 11:43:05 +01:00
Florian Hahn	c59d95f6a4	[ConstraintElimination] Check if const. is small enough before using it Check if the value of a ConstantInt is small enough to be used for solving before calling getSExtValue. Fixes #55085	2022-04-26 13:56:32 +01:00
Florian Hahn	857c612d89	[IPSCCP] Support unfeasible default dests for switch. At the moment, unfeasible default destinations are not handled properly in removeNonFeasibleEdges. So far, only unfeasible cases are removed, but later code expects unreachable blocks to have no predecessors. This is causing the crash reported in PR49573. If the default destination is unfeasible it won't be executed. Create a new unreachable block on demand and use that as default destination. Note that at the moment this only is relevant for cases where resolvedUndefsIn marks the first case as executable. Regular switch handling has a FIXME/TODO to support determining whether the default case is feasible or not. Fixes #48917. Differential Revision: https://reviews.llvm.org/D113497	2022-04-26 12:41:41 +01:00
Dmitry Makogon	d03d2d8aea	[RS4GC] Prune inputs of BDV if they are BDV themselves Don't check whether an input of BDV can be pruned if the input is the BDV itself. BDV is present in the states map, so in case the input is the BDV itself, we'd return false. So explicitly check this case. Differential Revision: https://reviews.llvm.org/D123846	2022-04-26 16:05:00 +07:00
David Green	9727c77d58	[NFC] Rename Instrinsic to Intrinsic	2022-04-25 18:13:23 +01:00
Florian Hahn	6a6cc5542b	[SimpleLoopUnswitch] Enable freezing of conditions by default. This fixes a series of mis-compiles by SimpleLoopUnswitch. My measurements showed no performance regression with -O3 on AArch64 in SPEC2006, SPEC2017 and a set of internal benchmarks. Fixes #50387, #50430 Depends on D124251. Reviewed By: nikic, aqjune Differential Revision: https://reviews.llvm.org/D124252	2022-04-25 14:26:41 +01:00
Max Kazantsev	606a000d1a	[LoopInstSimplify] Ignore users in unreachable blocks. PR55072 Logic in this pass assumes that all users of loop instructions are either in the same loop or are LCSSA Phis. In fact, there can also be users in unreachable blocks that currently break assertions. Such users don't need to go to the next round of simplifications. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D124368	2022-04-25 17:35:28 +07:00
Florian Hahn	b341c44010	[SimpleLoopUnswitch] Check if freeze is needed for partial unswitching. We only need to insert a Freeze instruction if any of the conditions may be poison. Similar checks are already done in the other places SimpleLoopUnswitch creates Freeze instruction. Reviewed By: aeubanks, efriedma Differential Revision: https://reviews.llvm.org/D124259	2022-04-22 21:24:55 +01:00
Fangrui Song	14d9390721	Revert D123198 "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls." test/Transforms/InstCombine/pr39177.ll failed in a -DLLVM_USE_SANITIZER=Undefined build. ``` lib/Transforms/Utils/BuildLibCalls.cpp:1217:17: runtime error: reference binding to null pointer of type 'llvm::Function' ``` `Function &F = *M->getFunction(Name);` This reverts commit `0f8c626723`.	2022-04-19 22:26:10 -07:00
Paul Kirth	bac6cd5bf8	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-04-19 21:23:48 +00:00
Jonas Paulsson	0f8c626723	[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls. A new set of overloaded functions named getOrInsertLibFunc() are now supposed to be used instead of getOrInsertFunction() when building a libcall from within an LLVM optimizer(). The idea is that this new function also makes sure that any mandatory argument attributes are added to the function prototype (after calling getOrInsertFunction()). inferLibFuncAttributes() is renamed to inferNonMandatoryLibFuncAttrs() as it only adds attributes that are not necessary for correctness but merely helping with later optimizations. Generally, the front end is responsible for building a correct function prototype with the needed argument attributes. If the middle end however is the one creating the call, e.g. when replacing one libcall with another, it then must take this responsibility. This continues the work of properly handling argument extension if required by the target ABI when building a lib call. getOrInsertLibFunc() now does this for all libcalls currently built by any LLVM optimizer. It is expected that when in the future a new optimization builds a new libcall with an integer argument it is to be added to getOrInsertLibFunc() with the proper handling. Note that not all targets have it in their ABI to sign/zero extend integer arguments to the full register width, but this will be done selectively as determined by getExtAttrForI32Param(). Review: Eli Friedman, Nikita Popov, Dávid Bolvanský Differential Revision: https://reviews.llvm.org/D123198	2022-04-19 21:22:07 +02:00
eop Chen	38ec33d6b9	[LSR] Update outdated comment	2022-04-16 12:11:15 -07:00
Florian Hahn	ad95255b92	Revert "[LICM] Only create load in pre-header when promoting load." This reverts commit `4bf3b7dc92`. This might be causing another buildbot failure.	2022-04-13 20:24:28 +02:00
Anna Thomas	28f27dd264	Check users of instrinsics instead of traversing entire function.NFC Updated LowerGuardIntrinsic and LowerWidenableCondition to check for users of the respective intrinsic, instead of checking for guards and widenable conditions by traversing the entire function. This is an NFC. Should save some compile time.	2022-04-13 12:28:51 -04:00
Florian Hahn	4bf3b7dc92	Recommit "[LICM] Only create load in pre-header when promoting load." This reverts the revert commit `1ddc719680`. This version of the patch sets the initial available value to poison, which resolves an issue with the SSAUpdater breaking LCSSA form.	2022-04-13 17:20:39 +02:00
Whitney Tsang	80304c5f88	[LoopUnroll] Always respect user unroll pragma IMO when user provide unroll pragma, compiler should always respect it. It is not clear to me why loop unroll pass currently ensure that the unrolled loop size is limited by PragmaUnrollThreshold. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D119148	2022-04-11 14:33:24 -04:00
Florian Hahn	1ddc719680	Revert "[LICM] Only create load in pre-header when promoting load." This reverts commit `42229b96bf`. This appears to cause crashes on multiple bots.	2022-04-11 17:37:23 +02:00
Florian Hahn	42229b96bf	[LICM] Only create load in pre-header when promoting load. When only a store is sunk, there is no need to create a load in the pre-header, as the result of the load will never get used. The dead load can can introduce UB, if the function is marked as writeonly. Fixes #51248. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123473	2022-04-11 16:45:18 +02:00
Arthur Eubanks	b22ffc7b98	[CaptureTracking] Ignore ephemeral values in EarliestEscapeInfo And thread DSE's ephemeral values to EarliestEscapeInfo. This allows more precise analysis in DSEState::isReadClobber() via BatchAA. Followup to D123162. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123342	2022-04-08 10:07:26 -07:00
Zaara Syeda	07005440ae	[LSR] Optimize unused IVs to final values in the exit block Loop Strength Reduce sometimes optimizes away all uses of an induction variable from a loop but leaves the IV increments. When the only remaining use of the IV is the PHI in the exit block, this patch will call rewriteLoopExitValues to replace the exit block PHI with the final value of the IV to skip the updates in each loop iteration. Differential Revision: https://reviews.llvm.org/D118808	2022-04-08 11:16:37 -04:00
Nikita Popov	c8c6362560	[LICM] Pass MemorySSAUpdater by referene (NFC) Make it clearer that this is a required dependency.	2022-04-08 10:08:57 +02:00
Nikita Popov	5cefe7d9f5	[LoopSink] Require MemorySSA This makes MemorySSA in LoopSink required, and removes the AST-based implementation, as well as the related support code in LICM. Differential Revision: https://reviews.llvm.org/D123288	2022-04-08 09:49:44 +02:00
Austin Kerbow	26b14c3ea7	[InferAddressSpaces] Fix assert on invalid bitcast placement Similar to the problem in `0bb25b4603`, bitcasts that are inserted must dominate all uses. When rewriting "values" with "new values" that have the updated address space, we may replace the "new value" with a bitcast if one of the original users is an addresspace cast. This bitcast must be inserted before ALL users, not only before the addresspace cast. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D122964	2022-04-07 20:07:53 -07:00
Arthur Eubanks	17fdaccccf	[CaptureTracking] Ignore ephemeral values when determining pointer escapeness Ephemeral values cannot cause a pointer to escape. No change in compile time: https://llvm-compile-time-tracker.com/compare.php?from=4371710085ba1c376a094948b806ddd3b88319de&to=c5ddbcc4866f38026737762ee8d7b9b00395d4f4&stat=instructions This partially fixes some regressions caused by more calls to `__builtin_assume` (D122397). Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D123162	2022-04-07 10:11:14 -07:00
Nikita Popov	e22af03a79	[Sink] Don't sink non-willreturn calls (PR51188) Fixes https://github.com/llvm/llvm-project/issues/51188.	2022-04-07 16:35:05 +02:00
Nikita Popov	674ee4d353	[LoopSink] Use MemorySSA with legacy pass manager LoopSink with the legacy pass manager still uses AST, because we can't compute MemorySSA conditionally. I think now that the legacy pass manager will be removed soon(TM) we don't need to care about compile-time impact here anymore. Additionally, since MemorySSA is no longer eagerly optimized, the impact is actually not that high anymore (~0.2% geomean regression on CTMark). This just makes legacy PM and new PM behavior line up -- as a followup I'll drop these options entirely and make MemorySSA use mandatory. Differential Revision: https://reviews.llvm.org/D123216	2022-04-07 09:40:29 +02:00
Matt Arsenault	39f1568633	Transforms: Split LowerAtomics into separate Utils and pass This will allow code sharing from AtomicExpandPass. Not entirely sure why these exist as separate passes though.	2022-04-06 20:54:45 -04:00
Alina Sbirlea	08075a7ee8	Revert `f7381a795a` Roll-forward `29fada4a3d`. Issue triggered was due to UB. Differential Revision: https://reviews.llvm.org/D121987	2022-04-06 16:06:14 -07:00
Congzhe Cao	eac3487510	[LoopInterchange] Try to achieve the most optimal access pattern after interchange Motivated by pr43326 (https://bugs.llvm.org/show_bug.cgi?id=43326), where a slightly modified case is as follows. void f(int e[10][10][10], int f[10][10][10]) { for (int a = 0; a < 10; a++) for (int b = 0; b < 10; b++) for (int c = 0; c < 10; c++) f[c][b][a] = e[c][b][a]; } The ideal optimal access pattern after running interchange is supposed to be the following void f(int e[10][10][10], int f[10][10][10]) { for (int c = 0; c < 10; c++) for (int b = 0; b < 10; b++) for (int a = 0; a < 10; a++) f[c][b][a] = e[c][b][a]; } Currently loop interchange is limited to picking up the innermost loop and finding an order that is locally optimal for it. However, the pass failed to produce the globally optimal loop access order. For more complex examples what we get could be quite far from the globally optimal ordering. What is proposed in this patch is to do a "bubble-sort" fashion when doing interchange. By comparing neighbors in `LoopList` in each iteration, we would be able to move each loop onto a most appropriate place, hence this is an approach that tries to achieve the globally optimal ordering. The motivating example above is added as a test case. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D120386	2022-04-06 15:31:56 -04:00
Martin Storsjö	46776f7556	Fix warnings about variables that are set but only used in debug mode Add void casts to mark the variables used, next to the places where they are used in assert or `LLVM_DEBUG()` expressions. Differential Revision: https://reviews.llvm.org/D123117	2022-04-06 10:01:46 +03:00
Bert Abrath	019e7b7f6e	[PartiallyInlineLibCalls] Don't partially inline a musttail libcall. Partially inlining a libcall that has the musttail attribute leads to broken LLVM IR, triggering an assertion in the IR verifier. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D123116	2022-04-05 22:30:50 +03:00
serge-sans-paille	1e02737593	[iwyu] Fix some header include regression Running iwyu-diff from https://github.com/serge-sans-paille/preprocessor-utils makes it possible to quickly spot regression in unused includes. This patch contains the few regressions since the last header cleanup. Differential Revision: https://reviews.llvm.org/D123036	2022-04-05 15:02:03 +02:00
Nikita Popov	a5c3b5748c	[MemCpyOpt] Work around PR54682 As discussed on https://github.com/llvm/llvm-project/issues/54682, MemorySSA currently has a bug when computing the clobber of calls that access loop-varying locations. I think a "proper" fix for this on the MemorySSA side might be non-trivial, but we can easily work around this in MemCpyOpt: Currently, MemCpyOpt uses a location-less getClobberingMemoryAccess() call to find a clobber on either the src or dest location, and then refines it for the src and dest clobber. This was intended as an optimization, as the location-less API is cached, while the location-affected APIs are not. However, I don't think this really makes a difference in practice, because I don't think anything will use the cached clobbers on those calls later anyway. On CTMark, this patch seems to be very mildly positive actually. So I think this is a reasonable way to avoid the problem for now, though MemorySSA should also get a fix. Differential Revision: https://reviews.llvm.org/D122911	2022-04-04 10:19:51 +02:00
Nikita Popov	c0cc98251a	[Float2Int] Make sure dependent ranges are calculated first (PR54669) The range calculation in walkForwards() assumes that the ranges of the operands have already been calculated. With the used visit order, this is not necessarily the case when there are multiple roots. (There is nothing guaranteeing that instructions are visited in topological order.) Fix this by queuing instructions for reprocessing if the operand ranges haven't been calculated yet. Fixes https://github.com/llvm/llvm-project/issues/54669. Differential Revision: https://reviews.llvm.org/D122817	2022-04-04 10:18:39 +02:00
Philip Reames	7c51669c21	[memcpyopt] Restructure store(load src, dest) form of callslotopt for compile time The search for the clobbering call is fairly expensive if uses are not optimized at construction. Defer the clobber walk to the point in the implementation we need it; there are a bunch of bailouts before that point. (e.g. If the source pointer is not an alloca, we can't do callslotopt.) On a test case which involves a bunch of copies from argument pointers, this switches memcpyopt from > 1/2 second to < 10ms.	2022-04-03 20:16:20 -07:00
Xiang1 Zhang	f830392be7	Correct spelling error in TLS-Load-Hoist	2022-04-04 08:27:54 +08:00
Florian Hahn	5bedc1f093	[ConstraintElimination] Move logic to build worklist to helper (NFC). This refactor makes it easier to extend the logic to collect information from blocks in the future, without even further increasing the size of eliminateConstriants.	2022-04-02 16:55:05 +01:00
Xiang1 Zhang	a56f264958	Refine tls-load-hoista llvm option Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D122890	2022-04-01 19:03:58 +08:00
Jorge Gorbe Moya	fc7573f29c	Revert "[misexpect] Re-implement MisExpect Diagnostics" This reverts commit `46774df307`.	2022-03-31 14:54:41 -07:00
Paul Kirth	46774df307	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-03-31 17:38:21 +00:00
Nikita Popov	33ac23e7cf	[Float2Int] Avoid unnecessary lamdbas (NFC) Instead of first creating a lambda for calculating the range, then collecting the ranges for the operands, and then calling the lambda on those ranges, we can first calculate the operand ranges and then calculate the result directly in the switch.	2022-03-31 16:13:13 +02:00
Nikita Popov	f66975555f	[Float2Int] Extract calcRange() method (NFC) This avoids the awkward "Abort" flag, because we can simply early-return instead.	2022-03-31 16:13:13 +02:00
Aditya Kumar	368681f803	[GVNHoist] drop debug location according to the debug info guide According to the LLVM debug info update guide: https://llvm.org/docs/HowToUpdateDebugInfo.html, "Hoisting identical instructions which appear in several successor blocks into a predecessor block. In this case there is no single merged instruction. The rule for dropping locations applies". Thanks to Yuanbo Li for reporting this. Reviewed By: dblaikie Reviewers: sebpop, tejohnson, dblaikie Differential Revision: https://reviews.llvm.org/D122730	2022-03-30 20:17:53 -07:00
Stephen Long	e02f4976ac	[LoopIdiom] Merge TBAA of adjacent stores when creating memset Factor in the TBAA of adjacent stores instead of just the head store when merging stores into a memset. We were seeing GVN remove a load that had a TBAA that matched the 2nd store because GVN determined it didn't match the TBAA of the memset. The memset had the TBAA of only the first store. i.e. Loading the field pi_ of shared_count after memset to create an array of shared_ptr template<class T> class shared_ptr { T p; shared_count refcount; }; class shared_count { sp_counted_base pi_; }; Differential Revision: https://reviews.llvm.org/D122205	2022-03-30 16:54:49 -07:00
Chang-Sun Lin Jr	c28ce745cf	Value-number GVNHoist loads by result type as well as pointer address. Avoids merge errors when opaque pointers are loaded into different types. Reviewed by: jcranmer-intel, hiraditya Differential Revision: https://reviews.llvm.org/D122521	2022-03-30 11:33:49 -07:00
Florian Hahn	3dbb5eb2cd	[ConstraintElimination] Move ConstraintInfo after ConstraintTy. (NFC) Code movement to it slightly easier to use ConstraintTy & co in ConstraintInfo directly, for follow-up patches.	2022-03-29 09:59:03 +01:00
Serguei Katkov	6444a65514	[LSR] Fixup canonicalization formula and its checker. According to definition of canonical form, it is a canonical if scale reg does not contain addrec for loop L then none of bases should contain addrec for this loop. The critical word here is "contains". Current checker of canonical form checks not "containing" property but "is". So it does not check whether it contains but whether it is. Fix the checker and canonicalizing utility to follow definition. Without this fix in the test attached the base formula looking as reg((-1 * {0,+,8}<nuw><nsw><%bb2>)<nsw>) + 1reg((8 (%arg /u 8))<nuw>) is considered as conanocial while base contains an addrec. And modified formula we want to insert reg({0,+,8}<nuw><nsw><%bb2>) + 1reg((-8 (%arg /u 8))) is considered as not canonical. Reviewed By: mkazantsev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D122457	2022-03-29 14:05:04 +07:00
Paul Kirth	90cb325abd	Revert "[misexpect] Re-implement MisExpect Diagnostics" This reverts commit `2add3fbd97`.	2022-03-29 06:20:30 +00:00
Philip Reames	33deaa13b8	[memcpyopt] Common code into performCallSlotOptzn [NFC] We have the same code repeated in both callers, sink it into callee. The motivation here isn't just code style, we can also defer the relatively expensive aliasing checks until the cheap structural preconditions have been validated. (e.g. Don't bother aliasing if src is not an alloca.) This helps compile time significantly.	2022-03-28 20:10:13 -07:00
Paul Kirth	2add3fbd97	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-03-28 23:30:04 +00:00
Alina Sbirlea	f7381a795a	Revert `29fada4a3d` Seeing a test failure with asan in Halide generated code, reverting while I investigate. Differential Revision: https://reviews.llvm.org/D121987	2022-03-28 16:17:41 -07:00
Florian Hahn	8c3281db49	[ConstraintElimination] Use AddOverflow for offset summation. Fixes an incorrect transformation due to values overflowing https://alive2.llvm.org/ce/z/uizoea	2022-03-25 18:08:24 +00:00
Djordje Todorovic	9dbc687a5e	NFC: [LICM] Update some stale comments After removing the MaybePromotable, some comments became stale. This improves them. Differential Revision: https://reviews.llvm.org/D122319	2022-03-24 14:37:20 +01:00
Nikita Popov	29fada4a3d	[EarlyCSE] Don't eagerly optimize MemoryUses EarlyCSE currently optimizes all MemoryUses upfront. However, EarlyCSE only actually queries the clobbering memory access for a subset of uses, namely those where a CSE candidate has already been identified. Delaying use optimization to the clobber query improves compile-time in practice. This change is not NFC because EarlyCSE has a limit on the number of clobber queries (EarlyCSEMssaOptCap), in which case it falls back to the defining access. The defining access for uses will now no longer coincide with the optimized access. If there are performance regressions from this change, we should be able to address them by raising this limit. Differential Revision: https://reviews.llvm.org/D121987	2022-03-23 16:47:35 +01:00
Nikita Popov	afb526b3f4	[LICM] Handle store of pointer to itself (PR54495) Rather than iterating over users and comparing operands, iterate over uses and check operand number. Otherwise, we'll end up promoting a store twice if it has two equal operands. This can only happen with opaque pointers, as otherwise both operands differ by a level of indirection, so a bitcast would have to be involved. Fixes https://github.com/llvm/llvm-project/issues/54495.	2022-03-22 14:00:07 +01:00
Philip Reames	ee7324b898	Rename mayBeMemoryDependent to mayHaveNonDefUseDependency [nfc]	2022-03-21 10:01:40 -07:00
psamolysov-intel	2ed030ba88	[InferAddressSpaces][NFC] Small code improvements for the InferAddressSpaces pass There is a bunch of code improvements in the patch: marking as const everything what can be const and fixing some typos in comments. Also the patch removes the shadowing parameter TTI from the rewriteWithNewAddressSpaces method, the TTI parameter is not required because the same field is in the class. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D121671	2022-03-21 11:03:12 -05:00
Kazu Hirata	bce1bf0ee2	[Transform] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)	2022-03-20 10:41:22 -07:00
Nick Desaulniers	e1bae23f6f	[SCCP] do not clean up dead blocks that have their address taken [SCCP] do not clean up dead blocks that have their address taken Fixes a crash observed in IPSCCP. Because the SCCPSolver has already internalized BlockAddresses as Constants or ConstantExprs, we don't want to try to update their Values in the ValueLatticeElement. Instead, continue to propagate these BlockAddress Constants, continue converting BasicBlocks to unreachable, but don't delete the "dead" BasicBlocks which happen to have their address taken. Leave replacing the BlockAddresses to another pass. Fixes: https://github.com/llvm/llvm-project/issues/54238 Fixes: https://github.com/llvm/llvm-project/issues/54251 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D121744	2022-03-18 11:02:15 -07:00
Florian Hahn	5ab421fb4e	[LICM] Add allowspeculation pass options. This adds a new option to control AllowSpeculation added in D119965 when using `-passes=...`. This allows reproducing #54023 using opt. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D121944	2022-03-18 16:51:57 +00:00
Nikita Popov	ab2284a643	[LowerConstantIntrinsics] Make TLI a required dependency The way the pass is actually used in the optimization pipeline, TLI will be available, but this is not the case when running just -lower-constant-intrinsics in tests, which ends up being quite confusing. Require TLI unconditionally, as we usually do.	2022-03-18 14:59:18 +01:00
Nikita Popov	f96428e16d	[MemorySSA] Don't optimize uses during construction This changes MemorySSA to be constructed in unoptimized form. MemorySSA::ensureOptimizedUses() can be called to optimize all uses (once). This should be done by passes where having optimized uses is beneficial, either because we're going to query all uses anyway, or because we're doing def-use walks. This should help reduce the compile-time impact of MemorySSA for some use cases (the reason why I started looking into this is D117926), which can avoid optimizing all uses upfront, and instead only optimize those that are actually queried. Actually, we have an existing use-case for this, which is EarlyCSE. Disabling eager use optimization there gives a significant compile-time improvement, because EarlyCSE will generally only query clobbers for a subset of all uses (this change is not included in this patch). Differential Revision: https://reviews.llvm.org/D121381	2022-03-18 09:56:16 +01:00
Florian Hahn	4a699ae9c6	[LoopSimplifyCFG] Check predecessors of exits before marking them dead. LoopSimplifyCFG may process loops that are not in loop-simplify/canonical form. For loops not in canonical form, exit blocks may be reachable from non-loop blocks and we cannot consider them as dead if they only are not reachable from the loop itself. Unfortunately the smallest test I could come up with requires running multiple passes: -passes='loop-mssa(loop-instsimplify,loop-simplifycfg,simple-loop-unswitch)' The reason is that loops are canonicalized at the beginning of loop pipelines, so a later transform has to break canonical form in a way that breaks LoopSimplifyCFG's dead-exit analysis. Alternatively we could try to require all loop passes to maintain canonical form. That in turn would also require additional verification. Fixes #54023, #49931. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D121925	2022-03-18 08:54:44 +00:00

1 2 3 4 5 ...

11781 Commits