llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	569d84fe99	[VPlan] Remove dead recipes across whole plan. This extends removeDeadRecipe to remove recipes across the whole plan. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D127580	2022-06-23 13:36:02 +02:00
Serguei Katkov	8f891b7c39	[LoopVectorize] Uninitialized phi node leads to a crash in SSAUpdater. createInductionResumeValues creates a phi node placeholder without filling incoming values. Then it generates the incoming values. It includes triggering of SCEV expander which may invoke SSAUpdater. SSAUpdater has an optimization to detect the number of predecessors based on incoming values if there is a phi node. If the phi node is not filled with incoming values, the number of predecessors is detected as 0, and this leads to a segmentation fault. In other words, SSAUpdater expects the phi to be in good shape while LoopVectorizer breaks this requirement. The fix is to prepare all incoming values first and then build the phi node. Reviewed By: fhahn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D128033	2022-06-22 10:49:27 +07:00
Philip Reames	8ae0664282	[LoopVect, tests] Add some basic coverage for scalable costing of scatter/gather patterns on RISCV This just adds some very basic vectorizer testing with both fixed and scalable vectorization enabled.	2022-06-21 13:54:53 -07:00
Philip Reames	2cf320d41e	[LoopVect, tests] Add some basic coverage for scalable costing on RISCV This just adds some very basic vectorizer testing with both fixed and scalable vectorization enabled. For context, I just yesterday fixed a crash in costing of the splat_ptr example - see bbf3fd.	2022-06-21 13:35:38 -07:00
Florian Hahn	88ce403c6a	[LV] Add new block to place recurrence splice, if needed. In some cases, a recurrence splice instruction needs to be inserted between two regions, for example if the regions get re-arranged during sinking. Fixes #56146.	2022-06-21 21:54:37 +02:00
Florian Hahn	e9cced2739	Recommit "[LAA] Initial support for runtime checks with pointer selects." This reverts commit `7aa8a67882`. This version includes fixes to address issues uncovered after the commit landed and discussed at D11448. Those include: * Limit select-traversal to selects inside the loop. * Freeze pointers resulting from looking through selects to avoid branch-on-poison.	2022-06-17 21:06:26 +02:00
Malhar Jajoo	6bb40552f2	[LoopVectorize] Add support for invariant stores of ordered reductions Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D126772	2022-06-17 14:56:21 +01:00
Tiehu Zhang	b329156f4f	[AArch64][LV] AArch64 does not prefer vectorized addressing TTI::prefersVectorizedAddressing() tries to vectorize the addresses that lead to loads. For AArch64, only gather/scatter (supported by SVE) can deal with vectors of addresses. This patch specializes the hook for AArch64 to return true only when SVE is enabled. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D124612	2022-06-17 18:32:50 +08:00
Florian Hahn	467491202e	[LV] Update test to use GEP so it is not dead. The test should use the GEP for the store, so it is not dead.	2022-06-12 16:57:47 +01:00
Philip Reames	f7bb691d61	[RISCV] Implement isElementTypeLegalForScalableVector TTI hook This brings us into alignment with AArch64, and in the process fixes a compiler crash bug in uniform store handling in the vectorizer. Before the recent invalid cost bailout work, this would have also avoided crashes on invalid costs in some cases. I honestly think the vectorizer should gracefully bail out on uniform stores it can't use a scatter for, but it doesn't, so let's take the path of least resistance here. It's also possible that there are other vectorizer bugs AArch64 isn't seeing because of this hook; we don't want to be finding them either. Differential Revision: https://reviews.llvm.org/D127514	2022-06-10 13:20:58 -07:00
Philip Reames	0e29a80fdc	[RISCV] Add cost model for reverse shuffle The majority of the cost appears to be forming the indices vector. Differential Revision: https://reviews.llvm.org/D127141	2022-06-09 07:21:40 -07:00
Florian Hahn	20d798bd47	Recommit "[SCEV] Look through single value PHIs." (take 3) This reverts commit `1fbdbb5595`. All known issues surfaced by this patch should have been fixed now. The fixes included fixing issues with SCEV expansion in LV and DA's reliance on LCSSA phis.	2022-06-09 15:20:10 +01:00
Florian Hahn	85983ca42e	[VPlan] Replace remaining use of needsScalarIV. All information is already available in VPlan. Note that there are some test changes, because we now can correctly look through instructions like truncates to analyze the actual users. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D123541	2022-06-09 12:05:37 +01:00
Florian Hahn	3d663308a5	[LV] Add test that caused revert of D123720.	2022-06-08 12:25:17 +01:00
David Sherwood	997ecb0036	[LoopVectorize] Add FastMathFlags to the select used for reductions with tail-folding Based on reviewer comments on https://reviews.llvm.org/D126692 I've added FastMathFlags to the select instruction used when tail-folding with reductions. These flags can then be used by InstCombine to decide upon the most optimal floating point identity value for fadd/fsub. Doing so unlocks further optimisations, such as folding selects into masked loads. Differential Revision: https://reviews.llvm.org/D126778	2022-06-07 10:21:31 +01:00
Philip Reames	6071de3db6	[RISCV] Autogen a test for ease of update	2022-06-06 12:44:34 -07:00
Florian Hahn	eaf48dd9b0	[VPlan] Replace BranchOnCount with BranchOnCond if TC <= UF * VF. Try to simplify BranchOnCount to `BranchOnCond true` if TC <= UF * VF. This is an alternative to D121899 which simplifies the VPlan directly instead of doing so late in code-gen. The potential benefit of doing this in VPlan is that this may help cost-modeling in the future. The reason this is done in prepareToExecute at the moment is that a single plan may be used for multiple VFs/UFs. There are further simplifications that can be applied as follow ups: 1. Replace inductions with constants 2. Replace vector region with regular block. Fixes #55354. Depends on D126679. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D126680	2022-06-06 09:38:53 +01:00
yanming	8d9d8f866a	[RISCV] Define risc-v's own register class to model FP Register. The default RegisterClass is not enough to model RISCV Register. We define risc-v's own register class to model FP Register. This helps to better estimate the register pressure in the loop-vectorize. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D126854	2022-06-06 14:43:52 +08:00
Florian Hahn	a5bb4a3b4d	[VPlan] Replace CondBit with BranchOnCond VPInstruction. This patch removes CondBit and Predicate from VPBasicBlock. To do so, the patch introduces a new branch-on-cond VPInstruction opcode to model a branch on a condition explicitly. This addresses a long-standing TODO/FIXME that blocks shouldn't be users of VPValues. Those extra users can cause issues for VPValue-based analyses that don't expect blocks. Addressing this fixme should allow us to re-introduce `266ea446ab`. The generic branch opcode can also be used in follow-up patches. Depends on D123005. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D126618	2022-06-03 11:48:31 +01:00
Florian Hahn	72aca94b90	[LV] Add additional tests for pointer select support. Additional test cases for D114487.	2022-06-01 21:19:03 +01:00
Florian Hahn	05776122b6	[VPlan] Use region for each loop in native path. This patch updates the VPlan native path to use VPRegionBlocks for all loops in a loop nest. Up to now, only the outermost loop used a region. This is a step towards unifying both paths and keep things consistent between them. It also prepares various code-gen parts for modeling the pre-header in the inner loop vectorizer (D121624). Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D123005	2022-06-01 10:41:05 +01:00
Nikita Popov	03aceab08b	[ValueTracking] Enable -branch-on-poison-as-ub by default Now that SimpleLoopUnswitch and other transforms no longer introduce branch on poison, enable the -branch-on-poison-as-ub option by default. The practical impact of this is mostly better flag preservation in SCEV, and some freeze instructions no longer being necessary. Differential Revision: https://reviews.llvm.org/D125299	2022-06-01 10:46:06 +02:00
Nikita Popov	36cbdaa163	[InstCombine] Fix inbounds preservation when swapping GEPs (PR44206) When reassociating GEPs, we can only keep inbounds if both original GEPs were inbounds, and their offsets have the same sign. For the sake of simplicity, I only handle the case where both offsets are non-negative here. It would probably be fine to just not preserve inbounds at all here, but as I don't see a compile-time impact for adding the isKnownNonNegative() calls I went with this more conservative approach. Fixes https://github.com/llvm/llvm-project/issues/44206. Differential Revision: https://reviews.llvm.org/D126687	2022-05-31 15:45:02 +02:00
Florian Hahn	b7d2b160c3	[VPlan] Add test for printing VPlan for outer loop vectorization. Test coverage for D123005.	2022-05-30 18:19:52 +01:00
Nikita Popov	a770f534e6	[InstCombine] When swapping GEPs, only keep inbounds if both are If only one of the GEPs is inbounds, then after swapping, there is no guarantee that one of them will be inbounds as well (see e.g. https://alive2.llvm.org/ce/z/agaCnp). This is only a partial fix, because even if both are inbounds, the result is not necessarily inbounds (if the offsets have different signs).	2022-05-30 17:04:42 +02:00
Liqin.Weng	a84026821b	[RISCV] Add test for experimental.vector.reverse ``` void vector_reverse_i64(int *A, int *B, int n) { #pragma clang loop vectorize_width(4, scalable) for (int i = n-1; i >= 0; i--) A[i] = B[i] + 1; } ``` When the scalable-vectorization option is on (or #pragma clang loop vectorize_width(elements, scalable) is set), loops with reverse iteration cannot be vectorized as <vscale x elements x elementType>. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125866	2022-05-27 06:30:07 +00:00
Ivan Kosarev	ad1d60c3be	[FileCheck] Catch misspelled directives. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D125604	2022-05-26 11:37:19 +01:00
David Green	75631438e3	[AArch64] Costmodel tests for llvm.vscale intrinsics. NFC These show that the cost of a @llvm.vscale is indeed 1, not 10.	2022-05-26 10:16:21 +01:00
David Sherwood	87936c7b13	[LoopVectorize] Fix assertion failure in fixReduction when tail-folding When compiling the attached new test in scalable-reductions-tf.ll we were hitting this assertion in fixReduction: Assertion `isa<PHINode>(U) && "Reduction exit must feed Phi's or select" The loop contains a reduction and an intermediate store of the reduction value. When vectorising with tail-folding the contents of 'U' in the assertion above happened to be a scatter_store. It turns out that we were still creating a widen recipe for the invariant store, despite knowing that we can actually sink it. The simplest fix is to change buildVPlanWithVPRecipes so that we look for invariant stores before attempting to widen them. Differential Revision: https://reviews.llvm.org/D126295	2022-05-25 11:46:32 +01:00
Jingu Kang	bb82f74612	Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"" This reverts commit `42ebfa8269`. The commit from https://reviews.llvm.org/D125918 has fixed the stage 2 build failure. Differential Revision: https://reviews.llvm.org/D118979	2022-05-23 16:15:45 +01:00
Peter Waller	ade47bdc31	[LV] Improve register pressure estimate at high VFs Previously, `getRegUsageForType` was implemented using `getTypeLegalizationCost`. `getRegUsageForType` is used by the loop vectorizer to estimate the register pressure caused by using a vector type. However, `getTypeLegalizationCost` currently only appears to understand splitting and not scalarization, so it significantly underestimates the register requirements. Instead, use `getNumRegisters`, which understands when scalarization can occur (via computeRegisterProperties). This was discovered while investigating D118979 (Set maximum VF with shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the loop vectorizer previously ends up costing a v128i1 as 2 v64i* registers where it actually occupies 128 i32 registers. I'm sending this patch early for comment, I'm still doing some sanity checking with LNT. I note that getRegisterClassForType appears to return VectorRC even though the type in question (large vNi1 types) ends up occupying scalar registers. That might be worth fixing too. Differential Revision: https://reviews.llvm.org/D125918	2022-05-23 07:57:45 +00:00
Florian Hahn	419e49621f	[LV] Add check line to test interleaving only with induction cast. Also simplify the value names a bit in the test.	2022-05-22 20:11:47 +01:00
Florian Hahn	145fe57106	[LV] Use exiting block instead of latch in addUsersInExitBlock. The latch may not be the exiting block. Use the exiting block instead when looking up the incoming value of the LCSSA phi node. This fixes a crash with early-exit loops.	2022-05-22 18:27:41 +01:00
Florian Hahn	c230ab6db8	[LV] Re-generate check lines for loop-form.ll test.	2022-05-22 18:20:33 +01:00
Florian Hahn	97590baead	[LV] Widen ptr-inductions with scalar uses for scalable VFs. Current codegen only supports scalarization of pointer inductions for scalable VFs if they are uniform. After `3bebec659` we now may enter the scalarization code path in VPWidenPointerInductionRecipe::execute for scalable vectors. Fall back to widening for scalable vectors if necessary. This should fix a build failure when bootstrapping LLVM with SVE, e.g. https://lab.llvm.org/buildbot/#/builders/176/builds/1723	2022-05-22 16:24:13 +01:00
Florian Hahn	3bebec6592	[VPlan] Model first exit values using VPLiveOut. This patch introduces a new VPLiveOut subclass of VPUser to model exit values explicitly. The initial version handles exit values that are neither part of induction or reduction chains nor first order recurrence phis. Fixes #51366, #54867, #55167, #55459 Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D123537	2022-05-21 16:01:38 +01:00
Florian Hahn	a84896f270	[LV] Precommit test for PR55167. Test for #55167.	2022-05-21 16:01:33 +01:00
Florian Hahn	cd61d4bd2f	[LV] Do not LoopSimplify/LCSSA after generating main vector loop. At the moment LV runs LoopSimplify and reconstructs LCSSA form after generating the main vector loop and before generating the epilogue vector loop. In practice, this adds a new exit block for the scalar loop because the middle block now also branches to the original exit block of the scalar loop. It also requires adding a new LCSSA phi in the newly created exit block. This complicates things when modeling exit values in VPlan, because we would need to update the VPlan for the epilogue loop to update the newly created LCSSA phi node. But none of that should be necessary, as all analysis requiring loop-simplify form is already done at this point and LCSSA form of the original loop is not broken. Reviewed By: bmahjour Differential Revision: https://reviews.llvm.org/D125810	2022-05-20 09:58:40 +01:00
Florian Hahn	c90235f0ef	[LV] Drop wrap flags for reductions using VP def-use chain. Update clearReductionWrapFlags to use the VPlan def-use chain from the reduction phi recipe to drop reduction wrap flags. This addresses an existing FIXME and fixes a crash when instructions in the reduction chain are not used and have been removed before VPlan codegeneration. Fixes #55540.	2022-05-19 20:36:46 +01:00
Tiehu Zhang	3ed9f603fd	[LoopVectorize] Don't interleave when the number of runtime checks exceeds the threshold The runtime check threshold should also restrict interleave count. Otherwise, too many runtime checks will be generated for some cases. Reviewed By: fhahn, dmgreen Differential Revision: https://reviews.llvm.org/D122126	2022-05-19 23:29:00 +08:00
Tiehu Zhang	94a2bd5a27	[LoopVectorize] Precommit a test for D122126	2022-05-19 23:28:39 +08:00
lizhijin	90ea81fcb2	[LV] Widen freeze instead of scalarizing it This patch changes the strategy for vectorizing the freeze instruction, from replicating it multiple times to widening it according to the selected VF. Fixes #54992 Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D125016	2022-05-19 12:28:01 +08:00
Florian Hahn	d92cec4c96	[LV] Regenerate check lines for some tests. Make sure the auto-generated check lines are up-to-date for some files, to reduce the test diff in upcoming changes	2022-05-17 17:45:01 +01:00
Nikita Popov	356d47ccb9	[ValueTracking] Handle and/or on RHS of isImpliedCondition() isImpliedCondition() currently handles and/or on the LHS, but not on the RHS, resulting in asymmetric behavior. This patch adds two new implication rules: * LHS ==> (RHS1 \|\| RHS2) if LHS ==> RHS1 or LHS ==> RHS2 * LHS ==> !(RHS1 && RHS2) if LHS ==> !RHS1 or LHS ==> !RHS2 Differential Revision: https://reviews.llvm.org/D125551	2022-05-16 16:30:26 +02:00
Florian Hahn	b7315ffc3c	[LAA,LV] Add initial support for pointer-diff memory checks. This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable. The main idea is to only check if source and sink of a dependency are far enough apart so the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to `VF * UF * AccessSize`. It is sufficient to check `(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is no dependence preventing vectorization, hence the overflow should not matter and using the ULT should be sufficient. Note that the initial version is restricted in multiple ways: 1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information) 2. Source and sink pointers must be add-recs, with matching steps 3. The step must be a constant. 4. abs(step) == AccessSize. Most of those restrictions can be relaxed in the future. See https://github.com/llvm/llvm-project/issues/53590. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D119078	2022-05-16 15:27:22 +01:00
David Sherwood	befc952045	[LoopVectorize] Permit tail-folding for low trip counts using scalable vectors When the loop vectoriser encounters a known low trip count it tries to create a single predicated loop in order to get the benefit of vectorisation and eliminate the scalar tail. However, until now the vectoriser prevented the use of scalable vectors in this case due to concerns in the past about stability. I believe that tail-folded loops using scalable vectors are now sufficiently well tested that we can enable this. For the same reason I've also enabled it when optimising for code size too. Tests added here: Transforms/LoopVectorize/AArch64/sve-low-trip-count.ll Transforms/LoopVectorize/AArch64/sve-tail-folding-optsize.ll Transforms/LoopVectorize/RISCV/low-trip-count.ll Differential Revision: https://reviews.llvm.org/D121595	2022-05-16 09:14:24 +01:00
Florian Hahn	8b7c3d2179	[LV] Set SCEVCheckCond to nullptr whenever it was used. Under some circumstances, SCEVExpander will insert new instructions when expanding a predicate, but the final result of the expansion can be a false constant. In those cases, the expanded instructions may later be used by other expansions, e.g. the trip count. This may trigger an assertion during SCEVExpander cleanup. To avoid this, always mark the result as used. Fixes #55100.	2022-05-15 21:52:07 +01:00
Florian Hahn	39552964e1	[VPlan] Improve printing of VPReplicateRecipe with calls. Suggested as part of D124718.	2022-05-15 15:51:26 +01:00
Nikita Popov	0c00dbb975	[LoopVectorize] Regenerate test checks (NFC)	2022-05-13 16:41:48 +02:00
David Sherwood	92c645b5c1	[LoopVectorize] Add overflow checks when tail-folding with scalable vectors In InnerLoopVectorizer::getOrCreateVectorTripCount there is an assert that the known minimum value for the VF is a power of 2 when tail-folding is enabled. However, for scalable vectors the value of vscale may not be a power of 2, which means we have to worry about the possibility of overflow. I have solved this problem by adding preheader checks that prevent us from entering the vector body if the canonical IV would overflow, i.e. if ((IntMax - TripCount) < (VF * UF)) ... skip vector loop ... Differential Revision: https://reviews.llvm.org/D125235	2022-05-13 14:09:43 +01:00
Florian Hahn	38189438b6	[LV] Add crashing test from #55096.	2022-05-12 22:40:28 +01:00
Florian Hahn	635b752211	[VPlan] VPInterleaveRecipe only uses first lane if op not stored. With opaque pointers, both the stored value and the address can be the same. Only consider the recipe using the first lane only if the address is not stored. Fixes #55375.	2022-05-11 11:24:56 +01:00
Florian Hahn	e79c1962b9	[LV] Add opaque pointer test for #55375.	2022-05-11 11:24:52 +01:00
Nikita Popov	ff20ee32d8	[LoopVectorize] Remove incorrect nuw flag from test (NFC) nuw does not make sense for reverse iteration.	2022-05-10 12:17:09 +02:00
David Sherwood	45f2e92d97	[NFC][LoopVectorize] Add SVE test for tail-folding combined with interleaving Differential Revision: https://reviews.llvm.org/D125001	2022-05-09 13:08:25 +01:00
Simon Pilgrim	cbfa857346	[CostModel][X86] Adjust 128-bit select costs to account for slow BLENDV op Based off the script from D103695 - Jaguar, Bulldozer, Silvermont (et al) and Haswell all have slow BLENDV ops, so adjust the worse case cost values	2022-05-06 13:07:34 +01:00
Florian Hahn	ff8d0b338f	[VPlan] Add test for printing plan with an exit value. Test for printing plan with additions from D123537.	2022-05-04 17:19:02 +01:00
Igor Kirillov	4e5e042d9a	[LoopVectorize] Support reductions that store intermediary result Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop. Ordered fadd reductions are not yet supported. Differential Revision: https://reviews.llvm.org/D110235	2022-05-03 10:12:30 +01:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single one. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as an overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00
Florian Hahn	0ef8ca6d88	[VPlan] Do not create VPWidenCall recipes for scalar vector factors. 'Widen' recipes are only used when actual vector values are generated. Fix tryToWidenCall to not create VPWidenCallRecipes for scalar vector factors. This was exposed by D123720, because the widened recipes are considered vector users. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D124718	2022-05-02 19:40:33 +01:00
David Green	c7d39fd61a	[LV][SLP] Add tests for vectorizing fptoi_sat intrinsics. NFC	2022-05-02 15:11:44 +01:00
Simon Pilgrim	cff0afc184	[LoopVectorize][X86] Regenerate invariant-store-vectorization.ll	2022-05-01 13:04:24 +01:00
Simon Pilgrim	c2964746e3	[CostModel][X86] Reduce cost of vector selects on SSE2/AVX1 targets Based off the script from D103695, we were exaggerating the cost of the OR(AND(X,M),AND(Y,~M)) expansion using instruction count instead of effective throughput	2022-05-01 09:32:14 +01:00
Florian Hahn	841fffa745	[LV] Add test for interleaving multiple iterations with call.	2022-04-30 20:43:22 +01:00
Bjorn Pettersson	2e14900db9	[test][NewPM] Use -passes=loop-vectorize instead of -loop-vectorize Update a bunch of loop-vectorize regression tests to use the new PM syntax (opt -passes=loop-vectorize) instead of the deprecated legacy PM syntax (opt -loop-vectorize).	2022-04-28 16:46:00 +02:00
Florian Hahn	bea69b232f	[VPlan] Initial modeling of middle block in VPlan. This patch extends the scope of VPlan to also include the exit (aka middle) block. For now, the exit block remains empty, but handling of exit values will subsequently be moved to VPlan, by adding recipes to model exit values in the exit block. As a first step, this will allow fixing #51366. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D123457	2022-04-20 19:34:41 +01:00
Florian Hahn	a65f2730d2	[VPlan] Expand induction step in VPlan pre-header. This patch moves SCEV expansion of steps used by VPWidenIntOrFpInductionRecipes to the pre-header using VPExpandSCEVRecipe. This ensures that those steps are expanded while the CFG is in a valid state. Previously, SCEV expansion may happen during vector body code-generation, during which the CFG may be invalid, causing issues with SCEV expansion. Depends on D122095. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D122096	2022-04-19 13:06:39 +02:00
Craig Topper	ac8c720d48	[IR] Allow constant folding (insertelement <vscale x 2 x i32> zeroinitializer, i32 0, i32 0). Most of insertelement constant folding is blocked if the vector type is scalable. I believe we can make an exception for inserting null into an all zeros vector. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123413	2022-04-15 17:44:32 -07:00
Florian Hahn	73f5d7d0d6	[VPlan] Handle equal address and store ops in onlyFirstLaneDemanded. With opaque pointers, the stored value and address can be the same. Previously the code in VPWidenMemoryInstructionRecipe::onlyFirstLaneDemanded incorrectly considers stores with matching store and pointer operands as only demanding the first lane, causing a crash.	2022-04-15 22:53:33 +02:00
Muhammad Omair Javaid	42ebfa8269	Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth" This reverts commit `64b6192e81`. This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage: https://lab.llvm.org/buildbot/#/builders/176/builds/1515 llvm-tblgen crashes after applying this patch.	2022-04-13 04:53:07 +05:00
Simon Pilgrim	431e93f4f5	[InstCombine] Fold sub(add(x,y),min/max(x,y)) -> max/min(x,y) (PR38280) As discussed on Issue #37628, we can flip a min/max node if we're subtracting from the sum of the node's operands Alive2: https://alive2.llvm.org/ce/z/W_KXfy Differential Revision: https://reviews.llvm.org/D123399	2022-04-11 11:32:56 +01:00
Florian Hahn	5f1eb74850	[VPlan] Place VPExpandSCEVRecipe in pre-header. After D121624 models the pre-header in VPlan, VPExpandSCEVRecipes can be placed there. This ensures SCEV expansion happens before modifying the CFG during VPlan execution, when CFG is incomplete. Depends on D121624. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D122095	2022-04-10 10:26:20 +02:00
Florian Hahn	256c6b0ba1	[VPlan] Model pre-header explicitly. This patch extends the scope of VPlan to also model the pre-header. The pre-header can be used to place recipes that should be code-gen'd outside the loop, like SCEV expansion. Depends on D121623. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121624	2022-04-09 14:19:47 +02:00
Simon Pilgrim	450f0d76b4	[LoopVectorize] Regenerate first-order-recurrence.ll	2022-04-09 10:33:03 +01:00
Stanislav Mekhanoshin	fced87d457	[AMDGPU] Fix regression with vectorization limiting D67148 has removed TTI::getNumberOfRegisters(bool Vector) and started to call TTI::getNumberOfRegisters(unsigned ClassID) from the LoopVectorize. This has resulted in an unrestricted vectorization on AMDGPU blowing up register pressure. Differential Revision: https://reviews.llvm.org/D122850	2022-04-08 17:46:49 -07:00
Florian Hahn	467dbcd9f1	[LV] Set debug loc after setting insert point. This fixes the code to actually use the location of the instruction, if available. Previously, SetInsertPoint would overwrite the insert point set from the instruction.	2022-04-08 20:34:40 +02:00
Florian Hahn	4c0d5db9c9	[LV] Add test case for wrong debug location with replicate recipe.	2022-04-08 20:34:16 +02:00
Florian Hahn	29fe998eaa	[VPlan] Preserve debug location when creating branch. Update createEmptyBasicBlock to preserve the debug location of the previous terminator.	2022-04-08 17:22:53 +02:00
Florian Hahn	547567fe2b	[LV] Add test for missing debug info on branch in vector loop. Adds a test case where currently no debug location is added to branches in the vector body.	2022-04-08 17:22:53 +02:00
Florian Hahn	631016a853	[LV] Add test case for PR54427. Reduced test for #54427.	2022-04-07 23:21:21 +02:00
Jingu Kang	64b6192e81	[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth Set the maximum VF of AArch64 with 128 / the size of smallest type in loop. Differential Revision: https://reviews.llvm.org/D118979	2022-04-05 13:16:52 +01:00
Florian Hahn	1ff022e21b	[LV] Add vector.body block to parent loop during skeleton creation. When creating induction resume values, SCEV queries may rely on LoopInfo. Make sure vector.body gets added to the loop of the pre-header during skeleton construction. %vector.body will be moved to the vector preheader during VPlan execution. Fixes #54745.	2022-04-05 11:54:17 +01:00
Florian Hahn	368d35a894	[LV] Add additional tests for pointer difference memory checks. Additional tests for D119078.	2022-04-04 17:58:48 +01:00
Philip Reames	88de27e3fd	[LV] Handle non-integral types when considering interleave widening legality In general, anywhere we might need to insert a blind bitcast, we need to make sure the types are losslessly convertible. This fixes pr54634.	2022-04-03 20:16:20 -07:00
Dávid Bolvanský	872f7000fc	Revert "[NFCI] Regenerate SROA/LoopVectorize test checks" This reverts commit `14e3450fb5`.	2022-04-04 01:15:30 +02:00
Dávid Bolvanský	a113a582b1	[NFCI] Regenerate LoopVectorize test checks	2022-04-03 21:56:24 +02:00
Florian Hahn	95b2aa511e	[VPlan] Set VPlan header block name to vector.body. This brings the VPlan block naming in line with the naming of the generated basic blocks.	2022-04-02 19:34:32 +01:00
Florian Hahn	a08c90a402	[LV] Re-use TripCount from EPI.TripCount. During skeleton construction for the epilogue vector loop, generic helpers use getOrCreateTripCount, which will re-expand the trip count computation. Instead, re-use the TripCount created during main loop vectorization.	2022-04-01 13:47:34 +01:00
David Green	b65267ca7b	[LV] Invalidate widening decisions after maximizing vector bandwidth When MaximizeVectorBandwidth is enabled, we can end up (via calls to collectUniformsAndScalars/setCostBasedWideningDecision through calculateRegisterUsage) making widening decisions before we have decided whether to fold the tail by masking. These decisions will be wrong if we later decided to fold the tail, for example when the trip count is very low. It will use incorrect costs for loads that should get masked, using standard memory operation costs instead. This still at the moment uses the EmulatedMaskMemRefHack costs (a bit unfortunately), but the old costs without this change were 1, leading to too optimistic vectorization. This slightly changes the way that the MaximizeVectorBandwidth option works to make it easier to test, always honouring the option if it is set. Differential Revision: https://reviews.llvm.org/D120215	2022-03-31 09:19:31 +01:00
Florian Hahn	ecb4171dcb	[LV] Handle zero cost loops in selectInterleaveCount. In some cases, like in the added test case, we can reach selectInterleaveCount with loops that actually have a cost of 0. Unfortunately a loop cost of 0 is also used to communicate that the cost has not been computed yet. To resolve the crash, bail out if the cost remains zero after computing it. This seems like the best option, as there are multiple code paths that return a cost of 0 to force a computation in selectInterleaveCount. Computing the cost at multiple places up front there would unnecessarily complicate the logic. Fixes #54413.	2022-03-29 22:52:43 +01:00
Florian Hahn	46432a0088	[VPlan] Add VPWidenPointerInductionRecipe. This patch moves pointer induction handling from VPWidenPHIRecipe to its own recipe. In the process, it adds all information required to generate code for pointer inductions without relying on Legal to access the list of induction phis. Alternatively VPWidenPHIRecipe could also take an optional pointer to InductionDescriptor. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121615	2022-03-24 14:58:45 +00:00
Florian Hahn	890fc21742	[LV] Extend checks in debugloc.ll.	2022-03-23 20:21:58 +00:00
Florian Hahn	973183612e	[VPlan] Add test for VPExpandSCEVRecipe printing.	2022-03-20 10:11:40 +00:00
Florian Hahn	d5fbcf76fd	[VPlan] Improve pattern in vplan-printing.ll check line. The existing pattern only matched a single value, which breaks if the numbering slightly changes.	2022-03-19 16:03:25 +00:00
Andrew Wei	0af3e6a22d	[InstCombine] Sink instructions with multiple users in a successor block. This patch tries to sink instructions when they are only used in a successor block. This is a further enhancement patch based on Anna's commit: D109700, which allows sinking an instruction having multiple uses in a single user. In this patch, sink instructions with multiple users in a single successor block will be supported. It could fix a known issue from rust: https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610 Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D121585	2022-03-18 11:53:45 +08:00
Florian Hahn	151c144350	[LV] Use usesScalars in widenPHIInstruction. This uses the existing VPlan helpers to check whether there are scalar uses of a phi recipe. It removes one of the few remaining dependencies on the cost model from VPlan code generation. Depends on D121612. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121613	2022-03-17 13:16:32 +00:00
Malhar Jajoo	a36d269658	[VPlan] Avoid collecting scalars for SVE This patch ensures scalars (except for uniforms) are no longer collected (prior to LVP planning phase) for scalable vectorization. This is to avoid the chances of generating scalarized instructions later (during LVP execute phase) as they are not supported for scalable vectorization. Relevant test has also been added. Differential Revision: https://reviews.llvm.org/D121452	2022-03-16 16:33:34 +00:00
Florian Hahn	5c4d64eb0d	[LV] Make reduction-order.ll test independent of instruction naming. Also update test to not use branch on undef.	2022-03-15 11:13:18 +00:00
Florian Hahn	4a0481e981	[LV] Check for users of truncated IVs, add more detailed comment. Add missing outside user check for truncated IVs. Also hoist the code in the helper with additional explanations. Fixes #54370.	2022-03-14 19:39:30 +00:00
Florian Hahn	1c0fc1f074	[VPlan] Ensure each iv user is only visited once in transform. If a recipe has multiple uses of an IV, we crash. It causes a crash when building llvm-test-suite. Exposed by `95f76bff1c`.	2022-03-13 21:42:17 +00:00
Florian Hahn	95f76bff1c	[LV] Create & use VPScalarIVSteps for all scalar users. This patch is a follow-up to D115953. It updates optimizeInductions to also introduce new VPScalarIVStepsRecipes if an IV has both vector and scalar uses. It updates all uses that only need scalar values to use the newly created recipe for the scalar steps. This completes untangling of VPWidenIntOrFpInductionRecipe code-generation. Now the recipe only creates the widened vector values, as it says on the tin. The code to generate IR has been moved directly to VPWidenIntOrFpInductionRecipe::execute. Note that the recipe has been updated to hold a reference to ScalarEvolution, which is needed to expand the step, until we can place the corresponding SCEV expansion in the pre-header. Depends on D120827. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D120828	2022-03-13 17:15:24 +00:00
Sanjay Patel	b48fe158e0	[Analysis] remove bogus smin/smax pattern detection This is a revert of `cfcc42bdc`. The analysis is wrong as shown by the minimal tests for instcombine: https://alive2.llvm.org/ce/z/y9Dp8A There may be a way to salvage some of the other tests, but that can be done as follow-ups. This avoids a miscompile and fixes #54311.	2022-03-09 17:50:34 -05:00
Florian Hahn	a12403cfea	[LV] Do not consider instrs dead if used by phi that's not in plan. Single value phis won't be modeled in VPlan. If the phi only gets used outside the loop, the current code misses the fact that the incoming value is not dead. Update the code to also look through such phis to check for outside users. Fixes #54266	2022-03-09 16:04:44 +00:00
Florian Hahn	a2979c8399	[IVDescriptors] Bail out instead of asserting that order is expected. When dealing with multiple phis that depend on each other, the order might have been changed and may not match the expectation. If that happens, bail out, rather than asserting. Fixes https://github.com/llvm/llvm-project/issues/54218 Fixes https://github.com/llvm/llvm-project/issues/54233 Fixes https://github.com/llvm/llvm-project/issues/54254	2022-03-07 19:57:26 +00:00
Florian Hahn	f4368487aa	[LV] Add test from PR54227. Test from https://github.com/llvm/llvm-project/issues/54227. The underlying issue has already been fixed in `de8ac48` with a separate test.	2022-03-07 17:01:22 +00:00
Roman Lebedev	2f80ea7f4f	[NFC][LV] Use different braces in debug output The analysis passes output function name encapsulated in `'` braces, but LV uses `"`. Harmonizing this may help in creating an update script for the LV costmodel test checks. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D121105	2022-03-07 19:32:37 +03:00
Florian Hahn	de8ac485e5	[IVDescriptor] Remove SinkCandidate from SinkAfter before re-sinking. This ensures the right order in the sink-after map is maintained. If we re-sink an instruction, it must be sunk after all earlier instructions have been sunk. Fixes https://github.com/llvm/llvm-project/issues/54223	2022-03-05 19:48:26 +00:00
Florian Hahn	5a60260efe	[IVDescriptor] Use DT to check order of Previous, OtherPrev. Previous and OtherPrev may not be in the same block. Use DT::dominates instead of local comesBefore. DT::dominates is already used earlier to check the order of Previous and SinkCandidate. Fixes https://github.com/llvm/llvm-project/issues/54195	2022-03-04 11:07:42 +00:00
Florian Hahn	139215af8e	[IVDescriptor] Find original 'Previous' for first-order recurrences. This patch extends first-order recurrence handling to support cases where we already sunk an instruction for a different recurrence, but LastPrev comes before Previous. To handle those cases correctly, we need to find the earliest entry for the sink-after chain, because this references the Previous from the original recurrence. This is needed to ensure we use the correct instruction as sink point. Depends on D118558. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D118642	2022-03-03 16:41:26 +00:00
Florian Hahn	8777cb66a8	[VPlan] Remove reliance on underlying instr for ScalarIVSteps (NFCI). Instead of relying on underlying instructions, this patch updates VPScalarIVStepsRecipe to only store the required type information. This removes access to unrelated information, as well as avoiding issues with the same underlying instruction being shared by multiple recipes. This change should only change the debug output and not cause any codegen changes, hence NFCI.	2022-03-02 16:23:19 +00:00
Florian Hahn	6dc456a375	[LV] Remove redundant check line from recurrence test. The removed line matches the previous line, modulo the check prefix. There is no way to disable sinking instructions as required due to first-order recurrence and removing the line should be safe.	2022-03-02 13:48:46 +00:00
Florian Hahn	83fd2071f0	[LV] Modernize test matching hardcoded induction phi name.	2022-03-02 10:12:38 +00:00
Florian Hahn	470b5c7f0d	[LV] Add test with multiple use of a FOR chained together. Additional test coverage for D118642.	2022-03-01 14:18:23 +00:00
Nikita Popov	26748bb15a	[InstCombine] Slightly relax one-use check in abs canonicalization Treat the icmp and sub symmetrically, and require that one of them has one use, not the icmp in particular. This could be further relaxed in the abs (but not nabs) case to not check one-use at all.	2022-03-01 15:06:41 +01:00
Nikita Popov	7c080e4649	[LoopVectorize] Regenerate test checks (NFC)	2022-03-01 15:01:14 +01:00
Andrei Elovikov	6e9a8cdcfb	[NFC][LoopVectorizer] Simplify LoopVectorize/X86/gather_scatter.ll The test used to run whole O3 pipeline. Modify it to contain LLVM IR right before LV and limit passes to "-loop-vectorizer -simplifycfg". For the RUN line with forced VF force interleave factor as well to simplify CHECKs as interleaving isn't related to the purpose of the test. I also tried to add "noalias" to pointer arguments in @test_gather_not_profitable_pr48429 but LAI seems unable to use them. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D119786	2022-02-28 11:12:50 -08:00
Florian Hahn	b3e8ace198	Recommit "[VPlan] Introduce recipe to build scalar steps." This reverts the revert commit `ff93260bf6`. The underlying issue causing the PPC bot failures has been fixed in `cbaac14734` and a corresponding test case has been added in `ad2cad1c52`. Original message: This patch adds a new VPScalarIVStepsRecipe to handle building scalar steps. In the first patch, it only handles the case where there is no vector induction variable needed. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D115953	2022-02-28 14:12:20 +00:00
Florian Hahn	cbaac14734	[LV] Remove induction recipes only used outside vector loop. Exit values of vector inductions are generated completely independent of the induction recipes. Consider them for removal, if they are not used in loop. This fixes a crash exposed by `49b23f451c`.	2022-02-28 11:14:22 +00:00
Florian Hahn	8bbc5e172a	[LV] Add test with dead induction in vector loop used outside. Add test with an induction phi that is not used in the vector loop, but by an lcssa phi in the loop exit.	2022-02-28 10:39:08 +00:00
Florian Hahn	ad2cad1c52	[LV] Add test with IV that needs scalar steps and user outside of loop. Also add a run line to check interleaving only. This test covers the PPC buildbot failures caused by `49b23f451c`.	2022-02-28 09:46:18 +00:00
Florian Hahn	ff93260bf6	Revert "[VPlan] Introduce recipe to build scalar steps." This reverts commit `49b23f451c`. This appears to break some PPC build bots. Revert while I investigate.	2022-02-27 17:51:19 +00:00
Florian Hahn	49b23f451c	[VPlan] Introduce recipe to build scalar steps. This patch adds a new VPScalarIVStepsRecipe to handle building scalar steps. In the first patch, it only handles the case where there is no vector induction variable needed. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D115953	2022-02-27 17:32:41 +00:00
Florian Hahn	da740492b0	[VPlan] Remove dead header-phi recipes. This patch adds a new transform to remove dead recipes. For now, it only removes dead recipes in the header, to keep the number of tests that require updating manageable. Future patches will extend this to remove dead recipes across the whole plan. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D118051	2022-02-26 16:26:39 +00:00
Florian Hahn	462cd9270c	[LV] Add test with redundant cast in separate latch block. Adds another interesting test for D118051.	2022-02-26 14:52:55 +00:00
Nikita Popov	a266af7211	[InstCombine] Canonicalize SPF to min/max intrinsics Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max. Once this sticks, we can stop matching SPF min/max in various places, and can remove hacks we have for preventing infinite loops and breaking of SPF canonicalization. Differential Revision: https://reviews.llvm.org/D98152	2022-02-24 09:01:20 +01:00
Malhar Jajoo	9f1c6fbf11	[LAA] Add remarks for unbounded array access Adds new optimization remarks when loop vectorization fails due to the compiler being unable to find bound of an array access inside a loop Differential Revision: https://reviews.llvm.org/D115873	2022-02-23 15:57:39 +00:00
Kerry McLaughlin	12fb133eba	[LoopVectorize] Support conditional in-loop vector reductions Extends getReductionOpChain to look through Phis which may be part of the reduction chain. adjustRecipesForReductions will now also create a CondOp for VPReductionRecipe if the block is predicated and not only if foldTailByMasking is true. Changes were required in tryToBlend to ensure that we don't attempt to convert the reduction Phi into a select by returning a VPBlendRecipe. The VPReductionRecipe will create a select between the Phi and the reduction. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D117580	2022-02-22 12:04:35 +00:00
Florian Hahn	5c7ae10cec	[LV] Add store to test to make sure the loop is not dead. Add an extra store to the test, to make sure the operations in the loop cannot be optimized away after D118051.	2022-02-20 15:05:29 +00:00
zhongyunde	b2f5164deb	[IVDescriptors] Support FOR where we have multiple sink pointed Handles the case where Previous doesn't come before LastPrev incorrectly. Fix https://github.com/llvm/llvm-project/issues/53483 Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D118558	2022-02-14 09:30:35 +08:00
Florian Hahn	d462e64754	[LV] Drop noalias from check lines from test (NFC). The noalias metadata checks are not really relevant for the test and slight changes to metadata numbering can have large knock-on effects causing large noise in test diff.	2022-02-13 11:36:54 +00:00
Florian Hahn	446e7c64c7	[LV] Add real uses in some tests, to make them more robust. Add real uses to some tests, to ensure dead instructions cannot be directly removed.	2022-02-13 09:52:59 +00:00
Florian Hahn	9474c3009e	[LV] Move unrelated tests from first-order-recurrence-chains.ll	2022-02-11 09:15:42 +00:00
Florian Hahn	f97795121f	[LV] Add tests with chained first-order recurrences.	2022-02-10 15:55:19 +00:00
Simon Pilgrim	4517488eb7	[LoopVectorize] Regenerate reduction-predselect.ll test checks	2022-02-10 12:03:10 +00:00
David Green	b55d4c2ad8	Revert "[LV] Remove `LoopVectorizationCostModel::useEmulatedMaskMemRefHack()`" This reverts commit `77a0da926c` as we've received multiple reports of this significantly impacting performance, in ways that don't seem to just be target specific cost models going wrong. I would offer some reproducers, but the test changes here seem to be full of them! Reverting for now and hopefully we can remove the "hack" more carefully as we go.	2022-02-09 20:02:54 +00:00
David Green	b4c6d1bb37	[LoopVectorizer] Don't perform interleaving of predicated scalar loops The vectorizer will choose at times to "vectorize" loops with a scalar factor (VF=1) with interleaving (IC > 1). This can occasionally produce better code than the unroller (notable for reductions where it can produce independent reduction chains that are combined after the loop). At times this is not very beneficial though, for example when runtime checks are needed or when the scalar code requires predication. This addresses the second point, preventing the vectorizer from interleaving when the scalar loop will require predication. This prevents it from making a bit of a mess, that is worse than the original and better left for the unroller to unroll if beneficial. It helps reverse some of the regressions from D118090. Differential Revision: https://reviews.llvm.org/D118566	2022-02-07 19:34:28 +00:00
Florian Hahn	1049735d07	[LV] Adjust accesses in test to ensure full RT checks are generated. Add an additional access so the full runtime checks are still generated, even after D119078.	2022-02-07 18:07:19 +00:00
Roman Lebedev	77a0da926c	[LV] Remove `LoopVectorizationCostModel::useEmulatedMaskMemRefHack()` D43208 extracted `useEmulatedMaskMemRefHack()` from legality into cost model. What it essentially does is prevents scalarized vectorization of masked memory operations: ``` // TODO: Cost model for emulated masked load/store is completely // broken. This hack guides the cost model to use an artificially // high enough value to practically disable vectorization with such // operations, except where previously deployed legality hack allowed // using very low cost values. This is to avoid regressions coming simply // from moving "masked load/store" check from legality to cost model. // Masked Load/Gather emulation was previously never allowed. // Limited number of Masked Store/Scatter emulation was allowed. ``` While i don't really understand about what specifically `is completely broken` was talking about, i believe that at least on X86 with AVX2-or-later, this is no longer true. (or at least, i would like to know what is still broken). So i would like to follow suit after D111460, and like wise disable that hack for AVX2+. But since this was added for X86 specifically, let's just instead completely remove this hack. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D114779	2022-02-07 16:08:31 +03:00
Florian Hahn	ef4df27940	[LV] Modernize some runtime check tests a bit. Update tests to check runtime checks a bit more precisely.	2022-02-07 12:08:56 +00:00
Sander de Smalen	eaee477eda	[LV] Use VScaleForTuning to allow wider epilogue VFs. When the main loop is e.g. VF=vscale x 1 and the epilogue VF cannot be any smaller, the vectorizer should try to estimate how many lanes are executed at runtime and allow a suitable fixed-width VF to be chosen. It can use VScaleForTuning to figure out what a suitable fixed-width VF could be. For the case where the main loop VF is VF=vscale x 1, and VScaleForTuning=8, it could still choose an epilogue VF upto VF=4. This was a bit tricky to test, so this patch also introduces a wrapper function to get 'VScaleForTuning' by also considering vscale_range. If min and max are equal, then that will be the vscale we compile for. It makes little sense to tune for a different width if the code will not be portable for other widths. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D118709	2022-02-03 15:40:17 +00:00
Malhar Jajoo	778b455dd6	[LAA] Add Memory dependence remarks. Adds new optimization remarks when vectorization fails. More specifically, new remarks are added for following 4 cases: - Backward dependency - Backward dependency that prevents Store-to-load forwarding - Forward dependency that prevents Store-to-load forwarding - Unknown dependency It is important to note that only one of the sources of failures (to vectorize) is reported by the remarks. This source of failure may not be first in program order. A regression test has been added to test the following cases: a) Loop can be vectorized: No optimization remark is emitted b) Loop can not be vectorized: In this case an optimization remark will be emitted for one source of failure. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D108371	2022-02-02 12:07:51 +00:00
Sander de Smalen	2a44eaf20f	[LV] Allow a scalable VF for the epilogue. For some reason we limited the epilogue VF to be fixed-width, but there is not necessarily a reason for doing so. If the main VF=vscale x 16, the epilogue VF could be either fixed-width, or a scalable VF up to vscale x 8. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D118688	2022-02-01 22:38:55 +00:00
David Green	aaa16eb023	[LV][AArch64] Add test for scalar interleaving with predication. NFC	2022-02-01 09:21:49 +00:00
Florian Hahn	02ee3fbff8	[LV] Add additional complex first order recurrence test. Add a new test case with 2 first-order recurrences, which share a user.	2022-01-31 19:54:14 +00:00
Florian Hahn	8f12175fed	[VPlan] Use VPlan to check if only the first lane is used. This removes the remaining dependence on LoopVectorizationCostModel from buildScalarSteps and is required so it can be moved out of ILV. It also allows us to remove a few unneeded instructions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116554	2022-01-30 13:07:29 +00:00
Florian Hahn	efd4938723	[VPlan] Handle IV vector splat using VPWidenCanonicalIV. This patch tries to use an existing VPWidenCanonicalIVRecipe instead of creating another step-vector for canonical induction recipes in widenIntOrFpInduction. This has the following benefits: 1. First step to avoid setting both vector and scalar values for the same induction def. 2. Reducing complexity of widenIntOrFpInduction through making things more explicit in VPlan 3. Only need to splat the vector IV for block in masks. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116123	2022-01-29 16:25:27 +00:00
Malhar Jajoo	b75bdff4a0	Trivial update for debug location in LIT test. This just updates debug location of a loop in a LIT test to point to the correct source line.	2022-01-27 19:07:47 +00:00
Congzhe Cao	f3e1f44340	[IVDescriptor] Get the exact FP instruction that does not allow reordering This is a bugfix in IVDescriptor.cpp. The helper function `RecurrenceDescriptor::getExactFPMathInst()` is supposed to return the 1st FP instruction that does not allow reordering. However, when constructing the RecurrenceDescriptor, we trace the use-def chain starting from a PHI node and for each instruction in the use-def chain, its descriptor overrides the previous one. Therefore in the final RecurrenceDescriptor we constructed, we lose previous FP instructions that do not allow reordering. Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D118073	2022-01-27 00:33:46 -05:00
Igor Kirillov	d3932c690d	[LoopVectorize] Add tests with reductions that are stored in invariant address This patch adds tests for functionality that is to be implemented in D110235. Differential Revision: https://reviews.llvm.org/D117213	2022-01-24 21:26:38 +00:00
Florian Hahn	b2a8eff45c	[LV] Make some tests more robust by adding missing users.	2022-01-24 13:04:09 +00:00
Florian Hahn	b7f69b8d46	[LV] Name values and blocks in same induction tests (NFC). This reduces the churn in the test in future updates due to numbering changes.	2022-01-24 12:28:43 +00:00
Kerry McLaughlin	8082ab2fc3	[LoopVectorize] Support epilogue vectorisation of loops with reductions isCandidateForEpilogueVectorization will currently return false for loops which contain reductions. This patch removes this restriction and makes the following changes to support epilogue vectorisation with reductions: - `fixReduction`: If fixReduction is being called during vectorisation of the epilogue, the phi node it creates will need to additionally carry incoming values from the middle block of the main loop. - `createEpilogueVectorizedLoopSkeleton`: The incoming values of the phi created by fixReduction are updated after the vec.epilog.iter.check block is added. The phi is also moved to the preheader of the epilogue. - `processLoop`: The start value of any VPReductionPHIRecipes are updated before vectorising the epilogue loop. The getResumeInstr function added to the ILV will return the resume instruction associated with the recurrence descriptor. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D116928	2022-01-24 12:03:31 +00:00
eopXD	3cf15af2da	[RISCV] Remove experimental prefix from rvv-related extensions. Extensions affected: +v, +zve, +zvl Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117860	2022-01-22 20:18:40 -08:00
Kerry McLaughlin	c740a07863	[LoopVectorize] Test in-loop reductions with tail folding for scalable vectors Adds `-prefer-inloop-reductions` to the RUN line of sve-tail-folding.ll & adds a new test where in-loop reductions cannot be used (`@cond_xor_reduction`). NFC. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D117578	2022-01-19 14:36:23 +00:00
David Sherwood	e781620dee	[LoopVectorize][AArch64] Use get.active.lane.mask intrinsic when SVE is enabled When SVE is enabled for AArch64 targets it makes more sense to use the get.active.lane.mask intrinsic, because SVE has an exact 1-1 mapping from the intrinsic to the 'whilelo' instruction for legal vector types. This instruction neatly takes overflow into account as well. This patch fixes an issue in VPInstruction::generateInstruction that assumed we are only dealing with fixed-width vectors. Differential Revision: https://reviews.llvm.org/D117109	2022-01-18 11:59:30 +00:00
Florian Hahn	524150fe07	[LV] Add test coverage for reductions with odd interleave counts. Add test coverage for loops with reductions and odd (3, 5) interleave counts.	2022-01-17 14:34:21 +00:00
Florian Hahn	4a6f475446	[LV] Make test more robust by adding users of inductions. The modified tests didn't have actual users of all inductions, making it trivial to eliminate them. Add users to make sure the inductions are actually used in the vectorized version.	2022-01-17 13:28:59 +00:00
Kito Cheng	cc35161dc7	[RISCV] Add initial support for getRegUsageForType and getNumberOfRegisters Those two TTI hooks are used during vectorization for calculating register pressure. The default implementation doesn't account for LMUL, and it also returns a definitely wrong value for the number of registers (every register class is reported as having 8 registers). So in this patch we tried to: 1. Calculate the right register usage for vector and scalar types. 2. Return the right number of registers for general-purpose and vector registers. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116890	2022-01-17 15:27:54 +08:00
Florian Hahn	070d1034da	[LV] Restore metadata to disable runtime unrolling for epilogue loop. After `d4a8fc3a87` LV stopped adding metadata to disable runtime unrolling to the vectorized epilogue loop. This was missed because `278aa65cc4` removed the relevant test coverage. This patch fixes that by adding the relevant metadata after vector loop generation.	2022-01-16 13:14:16 +00:00
Florian Hahn	ba3198cfd1	[IRBuilder] Migrate select-folding to value-based FoldSelect. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D117228	2022-01-15 11:26:44 +00:00
Florian Hahn	42b34facfd	Recommit "[LV] Inline CreateSplatIV call for scalar VFs." This reverts the revert commit `073c27b5e5`. A reduced test case has been added in `5e4966cbae` and the code has been updated to handle the case where getInductionOpcode returns BinaryOpsEnd. In this case, the original code was always using Instruction::Add. Do the same in the patch. Note this commit may slightly change the value naming, because it now also assigns the 'induction' name in the floating point case.	2022-01-14 19:03:49 +00:00
Florian Hahn	5e4966cbae	[LV] Add test with an integer induction based on a ptr one. Reduced test case from the reproducer mentioned in `073c27b5e5`.	2022-01-14 15:56:47 +00:00
James Y Knight	073c27b5e5	Revert "[LV] Inline CreateSplatIV call for scalar VFs (NFC)." Causes a crash with the following (creduce'd) test-case: clang -O3 '--target=aarch64-grtev4-linux-gnu' -xc - -c -o /dev/null <<EOF int e; int f; int g() { int h; int j = 0; while (&f - j > 0) { int k; k = j; if (e == j && *e) k = 5; h = k; j++; } return h; } EOF This reverts commit `7ce48be0fd`.	2022-01-14 00:00:02 +00:00
Florian Hahn	7b9f5cbfa7	[LV] Extend check lines for pr34681.ll to cover foldable select.	2022-01-13 16:42:47 +00:00
Florian Hahn	3f2fb767e3	[VPlan] Make IV operand explicit for VPWidenCanonicalIVRecipe (NFC). This makes the def-use relationship between VPCanonicalIVPHIRecipe and VPWidenCanonicalIVRecipe explicit. Needed for D117140.	2022-01-13 11:13:05 +00:00
Florian Hahn	7ce48be0fd	[LV] Inline CreateSplatIV call for scalar VFs (NFC). This is a NFC change split off from D116123, as suggested there. D116123 will remove the last user of CreateSplatIV.	2022-01-13 09:34:31 +00:00
Florian Hahn	d4a8fc3a87	[VPlan] Introduce and use BranchOnCount VPInstruction. This patch adds a new BranchOnCount VPInstruction opcode with 2 operands. It first compares its 2 operands (increment of canonical induction and vector trip count), followed by a branch to either the exit block or back to the vector header. It must be the last recipe in the exit block of the topmost vector loop region. This extracts parts from D113224 and was discussed in D113223. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116479	2022-01-12 13:42:13 +00:00
Rosie Sumpter	552eb372cb	[LoopVectorize] Pass a vector type to isLegalMaskedGather/Scatter This is required to query the legality more precisely in the LoopVectorizer. This adds another TTI function named 'forceScalarizeMaskedGather/Scatter' function to work around the hack introduced for MVE, where isLegalMaskedGather/Scatter would return an answer by second-guessing where the function was called from, based on the Type passed in (vector vs scalar). The new interface makes this explicit. It is also used by X86 to check for vector widths where gather/scatters aren't profitable (or don't exist) for certain subtargets. Differential Revision: https://reviews.llvm.org/D115329	2022-01-12 13:34:12 +00:00
Florian Hahn	138fcc5f76	[IRBuilder] Migrate icmp-folding to value-based FoldICmp. Depends on D116935. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D116969	2022-01-12 12:37:46 +00:00
Florian Hahn	7e68061305	[IRBuilder] Migrate add-folding to value-based FoldAdd. Depends on D116935. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D116968	2022-01-12 09:24:46 +00:00
Florian Hahn	f0ef1ea6dd	[IRBuilder] Introduce folder using inst-simplify, use for Or fold. Alternative to D116817. This introduces a new value-based folding interface for Or (FoldOr), which takes 2 values and returns an existing Value or a constant if the Or can be simplified. Otherwise nullptr is returned. This replaces the more restrictive CreateOr which takes 2 constants. This is then used to implement a folder that uses InstructionSimplify. The logic to simplify `Or` instructions is moved there. Subsequent patches are going to transition other CreateXXX to the more general FoldXXX interface. Reviewed By: nikic, lebedev.ri Differential Revision: https://reviews.llvm.org/D116935	2022-01-11 17:30:48 +00:00
David Sherwood	b0922a9dcd	[LoopVectorize] Make VPWidenCanonicalIVRecipe::execute work for scalable vectors The code in VPWidenCanonicalIVRecipe::execute only worked for fixed-width vectors due to the way we generate the values per lane. This patch changes the code to use a combination of vector splats and step vectors to get the same result. This then works for both fixed-width and scalable vectors. Tests that exercise this code path for scalable vectors have been added here: Transforms/LoopVectorize/AArch64/sve-tail-folding.ll Differential Revision: https://reviews.llvm.org/D113180	2022-01-10 14:12:32 +00:00
Florian Hahn	aecad5828e	[SCEVExpander] Only create trunc when needed. `9345ab3a45` updated generateOverflowCheck to skip creating checks that always evaluate to false. This in turn means that we only need to create TruncTripCount if it is actually used. Sink the TruncTripCount creating into ComputeEndCheck, so it is only created when there's an actual check.	2022-01-10 11:31:27 +00:00
David Sherwood	e3c84fb948	[LoopVectorize] Add support for tail folding using scalable vectors This patch fixes up an issue with InnerLoopVectorizer::getOrCreateVectorTripCount whereby we weren't correctly generating the runtime trip count for scalable vectors when tail-folding. It also removes some asserts in the tail-folding path for cases when the VF is not scalable. In this patch I have only permitted tail-folding to be enabled explicitly for scalable vectors when the user has specified one of the following flags: -prefer-predicate-over-epilogue=predicate-dont-vectorize -prefer-predicate-over-epilogue=predicate-else-scalar-epilogue For now it's best not to enable tail-folding with scalable vectors for low trip counts or when optimising for code size, since there has been no analysis on whether this is worth it. Various tests have been added here: Transforms/LoopVectorize/AArch64/sve-tail-folding.ll Transforms/LoopVectorize/AArch64/sve-tail-folding-forced.ll The tests cannot be target independent because they require masked load/store support, i.e. TTI.isLegalMaskedLoad and TTI.isLegalMaskedStore need to return true. Differential Revision: https://reviews.llvm.org/D113003	2022-01-10 10:55:40 +00:00
Florian Hahn	7f1bf68d7d	[SCEVExpander] Only check overflow if it is needed. `9345ab3a45` updated generateOverflowCheck to skip creating checks that always evaluate to false. This in turn means that we only need to check for overflows if the result of the multiplication is actually used. Sink the Or for the overflow check into ComputeEndCheck, so it is only created when there's an actual check.	2022-01-09 12:55:41 +00:00
Florian Hahn	3b7b1a75b0	[LV] Improve check lines in existing tests. Update the check lines in 2 existing tests to use patterns + variables to match some IR to make them independent of value naming.	2022-01-08 20:46:31 +00:00
Florian Hahn	daa5e26312	[LV] Make tests more robust by removing undef. Replace some uses of undef in the tests. The undef makes the runtime checks trivially foldable/removable, which defeats the purpose of the tests.	2022-01-08 15:21:57 +00:00
Florian Hahn	9345ab3a45	[SCEVExpander] Skip creating <u 0 check, which is always false. Unsigned compares of the form <u 0 are always false. Do not create such a redundant check in generateOverflowCheck. The patch introduces a new lambda to create the check, so we can exit early conveniently and skip creating some instructions feeding the check. I am planning to sink a few additional instructions as follow-ups, but I would prefer to do this separately, to keep the changes and diff smaller. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D116811	2022-01-08 10:31:04 +00:00
Craig Topper	042394b69e	[RISCV] Add a command line option to control the LMUL used by TTI's getRegisterBitWidth. By default we return the width of an LMUL=1 register. We can enable testing with larger LMUL values by returning a larger bit width. This patch adds a RISCV specific option to provide a LMUL which will be multiplied by the LMUL=1 bit width. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D116339	2022-01-07 20:02:10 -08:00
David Green	bc615e436c	[AArch64] Update addo and subo costs Similar to D116732, this adds basic scalar sadd_with_overflow, uadd_with_overflow, ssub_with_overflow and usub_with_overflow costs for aarch64, which are usually quite efficiently lowered. Differential Revision: https://reviews.llvm.org/D116734	2022-01-07 16:20:23 +00:00
Florian Hahn	f395a4f8d5	[SCEVExpand] Only create required predicate checks. Currently generateOverflowCheck always creates code for Step being negative and positive, followed by a select at the end depending on Step's sign. This patch updates the code to only create either the checks for step being positive or negative, if the sign is known. Follow-up to D116696. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D116747	2022-01-07 14:49:02 +00:00
Florian Hahn	86d113a8b8	[SCEVExpand] Do not create redundant 'or false' for pred expansion. This patch updates SCEVExpander::expandUnionPredicate to not create redundant 'or false, x' instructions. While those are trivially foldable, they can be easily avoided and hinder code that checks the size/cost of the generated checks before further folds. I am planning on looking into a few other similar improvements to code generated by SCEVExpander. I remember a while ago @lebedev.ri working on doing some trivial folds like that in IRBuilder itself, but there were concerns that such changes may subtly break existing code. Reviewed By: reames, lebedev.ri Differential Revision: https://reviews.llvm.org/D116696	2022-01-06 11:52:19 +00:00
Sander de Smalen	95a93722db	[LV] Remove what seems like stale code in collectElementTypesForWidening. This was originally added in rG22174f5d5af1eb15b376c6d49e7925cbb7cca6be although that patch doesn't really mention any reasons for ignoring the pointer type in this calculation if the memory access isn't consecutive. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D115356	2022-01-05 12:20:59 +00:00
Florian Hahn	65c4d6191f	[VPlan] Add VPCanonicalIVPHIRecipe, partly retire createInductionVariable. At the moment, the primary induction variable for the vector loop is created as part of the skeleton creation. This is tied to creating the vector loop latch outside of VPlan. This prevents modeling the whole vector loop in VPlan, which in turn is required to model preheader and exit blocks in VPlan as well. This patch introduces a new recipe VPCanonicalIVPHIRecipe to represent the primary IV in VPlan and CanonicalIVIncrement{NUW} opcodes for VPInstruction to model the increment. This allows us to partly retire createInductionVariable. At the moment, a bit of patching up is done after executing all blocks in the plan. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D113223	2022-01-05 10:46:06 +00:00
Rosie Sumpter	961f51fdf0	[LoopVectorize][CostModel] Choose smaller VFs for in-loop reductions without loads/stores For loops that contain in-loop reductions but no loads or stores, large VFs are chosen because LoopVectorizationCostModel::getSmallestAndWidestTypes has no element types to check through and so returns the default widths (-1U for the smallest and 8 for the widest). This results in the widest VF being chosen for the following example, float s = 0; for (int i = 0; i < N; ++i) s += (float) i*i; which, for more computationally intensive loops, leads to large loop sizes when the operations end up being scalarized. In this patch, for the case where ElementTypesInLoop is empty, the widest type is determined by finding the smallest type used by recurrences in the loop instead of falling back to a default value of 8 bits. This results in the cost model choosing a more sensible VF for loops like the one above. Differential Revision: https://reviews.llvm.org/D113973	2022-01-04 10:12:57 +00:00
Florian Hahn	b1a333f0fe	[VPlan] Don't consider VPWidenCanonicalIVRecipe phi-like. VPWidenCanonicalIVRecipe does not create PHI instructions, so it does not need to be placed in the phi section of a VPBasicBlock. Also tidies the code so the WidenCanonicalIV recipe and the compare/lane-masks are created in the header. Discussed D113223. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D116473	2022-01-02 12:48:17 +00:00
Sanjay Patel	0c6979b2d6	[InstCombine] fold opposite shifts around an add ((X << C) + Y) >>u C --> (X + (Y >>u C)) & (-1 >>u C) https://alive2.llvm.org/ce/z/DY9DPg This replaces a shift with an 'and', and in the case where the add has a constant operand, it eliminates both shifts. As noted in the TODO comment, we already have this fold when the shifts are in the opposite order (and that code handles bitwise logic ops too). Fixes #52851	2021-12-30 12:01:06 -05:00
Sanjay Patel	fd9cd3408b	Revert "[InstCombine] fold opposite shifts around an add" This reverts commit `2e3e0a5c28`. Some unintended diffs snuck into this patch.	2021-12-30 11:54:55 -05:00
Sanjay Patel	2e3e0a5c28	[InstCombine] fold opposite shifts around an add ((X << C) + Y) >>u C --> (X + (Y >>u C)) & (-1 >>u C) https://alive2.llvm.org/ce/z/DY9DPg This replaces a shift with an 'and', and in the case where the add has a constant operand, it eliminates both shifts. As noted in the TODO comment, we already have this fold when the shifts are in the opposite order (and that code handles bitwise logic ops too). Fixes #52851	2021-12-30 11:52:29 -05:00
Craig Topper	a9486a40f7	[RISCV] Disable interleaving scalar loops in the loop vectorizer. The loop vectorizer can interleave scalar loops even if it doesn't vectorize them. I don't believe we intended to enable this when we enabled interleaving for vector instructions. Disable interleaving for VF=1 like X86 and AMDGPU already do. Test lifted from AMDGPU. Differential Revision: https://reviews.llvm.org/D115975	2021-12-23 08:37:24 -06:00
Florian Hahn	ede7c2438f	[VPlan] Create header & latch blocks for skeleton up front (NFC). By creating the header and latch blocks up front and adding blocks and recipes in between those 2 blocks we ensure that the entry and exits of the plan remain valid throughout construction. In order to avoid test changes and keep printing of the plans the same, we use the new header block instead of creating a new block on the first iteration of the loop traversing the original loop. We also fold the latch into its predecessor. This is a follow up to a post-commit suggestion in D114586. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D115793	2021-12-22 12:44:25 +00:00
Sander de Smalen	290ae657a6	Fix buildbot failure caused by D115651 I somehow missed updating the RUN line of this test.	2021-12-20 17:18:59 +00:00
Sander de Smalen	b1ff20fd35	[LV] Enable scalable vectorization by default for SVE cores. The availability of SVE should be sufficient to enable scalable auto-vectorization. This patch adds a new TTI interface to query the target what style of vectorization it wants when scalable vectors are available. For other targets than AArch64, this currently defaults to 'FixedWidthOnly'. Differential Revision: https://reviews.llvm.org/D115651	2021-12-20 16:23:29 +00:00
Florian Hahn	5b362e4c7f	[VPlan] Add Debugloc to VPInstruction. Upcoming changes require attaching debug locations to VPInstructions, e.g. adding induction increment recipes in D113223. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D115123	2021-12-20 15:10:41 +00:00
Philip Reames	e6ad9ef4e7	[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transforms are inconsistent about which types we end up with based on visit order. I'm restricting this to constants as for non-constants, we'd have to decide whether the simplicity was worth extra instructions. For constants, there are no extra instructions. We chose the canonical type as i64 arbitrarily. We might consider changing this to something else in the future if we have cause. Differential Revision: https://reviews.llvm.org/D115387	2021-12-13 16:56:22 -08:00
Philip Reames	eb052f6b8f	Reapply: Autogen more vectorizer tests in advance of D115387. Drop changes to consecutive-ptr-uniforms.ll since that test checks both IR output and debug messages. I'd missed this in the original commit, and Florian pointed it out in post-commit review. Original commit message: These are the ones my first round of scripting couldn't handle that required a bit of manual massaging. This should be the last batch in llvm-check. This reverts commit `bbba86764a`.	2021-12-13 15:49:14 -08:00
Philip Reames	bbba86764a	Revert "Autogen more vectorizer tests in advance of D115387." This reverts commit `bbfaf0b170`. Post commit review noted a case where my manual update lost intentional check lines. Given I've abandoned the motivating patch, I'm just reverting the autogen prep.	2021-12-13 12:45:50 -08:00
Philip Reames	bbfaf0b170	Autogen more vectorizer tests in advance of D115387. These are the ones my first round of scripting couldn't handle that required a bit of manual massaging. This should be the last batch in llvm-check.	2021-12-13 11:04:20 -08:00
Philip Reames	1a18de3d0a	Autogen a bunch of instcombine and vectorizer tests Done in advance of D115387. These are all the ones which my local script could handle, there's a couple more which need manual updates.	2021-12-13 10:41:38 -08:00
Florian Hahn	e2885c7c9b	[VPlan] Add printing test with VPInstruction with debug locs. Test case for D113223.	2021-12-13 13:08:41 +00:00
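
Note on the InstCombine fold listed above (`2e3e0a5c28`, relanded as `0c6979b2d6`): the rewrite `((X << C) + Y) >>u C --> (X + (Y >>u C)) & (-1 >>u C)` can be sanity-checked by brute force on a small bit width. The following is only an illustrative sketch, not code from LLVM. The helper names (`original`, `folded`, `fold_holds`) and the default 6-bit width are assumptions chosen for this example, and wrap-around is modeled by masking Python integers. The commit message's Alive2 link remains the authoritative proof.

```python
def original(x, y, c, bits):
    """((X << C) + Y) >>u C, with all arithmetic wrapping at `bits` bits."""
    mask = (1 << bits) - 1
    shifted = (x << c) & mask          # shl discards the high bits
    total = (shifted + y) & mask       # add wraps around
    return total >> c                  # logical (unsigned) shift right


def folded(x, y, c, bits):
    """(X + (Y >>u C)) & (-1 >>u C), the form produced by the fold."""
    mask = (1 << bits) - 1
    return (x + (y >> c)) & (mask >> c)


def fold_holds(bits=6):
    """Exhaustively compare both forms for every X, Y and shift amount C < bits."""
    for c in range(bits):
        for x in range(1 << bits):
            for y in range(1 << bits):
                if original(x, y, c, bits) != folded(x, y, c, bits):
                    return False
    return True


if __name__ == "__main__":
    print("fold holds for 6-bit values:", fold_holds(6))
```

The masked form needs no left shift, and when `Y` is a constant the `Y >>u C` term folds away at compile time, which matches the commit message's remark that both shifts are eliminated in that case.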


1862 Commits