llvm-project

Commit Graph

Author	SHA1	Message	Date
Florian Hahn	635b752211	[VPlan] VPInterleaveRecipe only uses first lane if op not stored. With opaque pointers, both the stored value and the address can be the same. Only consider the recipe using the first lane only if the address is not stored. Fixes #55375.	2022-05-11 11:24:56 +01:00
Florian Hahn	e79c1962b9	[LV] Add opaque pointer test for #55375 .	2022-05-11 11:24:52 +01:00
Nikita Popov	ff20ee32d8	[LoopVectorize] Remove incorrect nuw flag from test (NFC) nuw does not make sense for reverse iteration.	2022-05-10 12:17:09 +02:00
David Sherwood	45f2e92d97	[NFC][LoopVectorize] Add SVE test for tail-folding combined with interleaving Differential Revision: https://reviews.llvm.org/D125001	2022-05-09 13:08:25 +01:00
Simon Pilgrim	cbfa857346	[CostModel][X86] Adjust 128-bit select costs to account for slow BLENDV op Based off the script from D103695 - Jaguar, Bulldozer, Silvermont (et al) and Haswell all have slow BLENDV ops, so adjust the worse case cost values	2022-05-06 13:07:34 +01:00
Florian Hahn	ff8d0b338f	[VPlan] Add test for printing plan with an exit value. Test for printing plan with additions from D123537.	2022-05-04 17:19:02 +01:00
Igor Kirillov	4e5e042d9a	[LoopVectorize] Support reductions that store intermediary result Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop. Ordered fadd reductions are not yet supported. Differential Revision: https://reviews.llvm.org/D110235	2022-05-03 10:12:30 +01:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single one. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as an overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00
Florian Hahn	0ef8ca6d88	[VPlan] Do not create VPWidenCall recipes for scalar vector factors. 'Widen' recipes are only used when actual vector values are generated. Fix tryToWidenCall to not create VPWidenCallRecipes for scalar vector factors. This was exposed by D123720, because the widened recipes are considered vector users. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D124718	2022-05-02 19:40:33 +01:00
David Green	c7d39fd61a	[LV][SLP] Add tests for vectorizing fptoi_sat intrinsics. NFC	2022-05-02 15:11:44 +01:00
Simon Pilgrim	cff0afc184	[LoopVectorize][X86] Regenerate invariant-store-vectorization.ll	2022-05-01 13:04:24 +01:00
Simon Pilgrim	c2964746e3	[CostModel][X86] Reduce cost of vector selects on SSE2/AVX1 targets Based off the script from D103695, we were exaggerating the cost of the OR(AND(X,M),AND(Y,~M)) expansion using instruction count instead of effective throughput	2022-05-01 09:32:14 +01:00
Florian Hahn	841fffa745	[LV] Add test for interleaving multiple iterations with call.	2022-04-30 20:43:22 +01:00
Bjorn Pettersson	2e14900db9	[test][NewPM] Use -passes=loop-vectorize instead of -loop-vectorize Update a bunch of loop-vectorize regression tests to use the new PM syntax (opt -passes=loop-vectorize) instead of the deprecated legacy PM syntax (opt -loop-vectorize).	2022-04-28 16:46:00 +02:00
Florian Hahn	bea69b232f	[VPlan] Initial modeling of middle block in VPlan. This patch extends the scope of VPlan to also include the exit (aka middle) block. For now, the exit block remains empty, but handling of exit values will subsequently be moved to VPlan, by adding recipes to model exit values in the exit block. As a first step, this will allow fixing #51366. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D123457	2022-04-20 19:34:41 +01:00
Florian Hahn	a65f2730d2	[VPlan] Expand induction step in VPlan pre-header. This patch moves SCEV expansion of steps used by VPWidenIntOrFpInductionRecipes to the pre-header using VPExpandSCEVRecipe. This ensures that those steps are expanded while the CFG is in a valid state. Previously, SCEV expansion may happen during vector body code-generation, during which the CFG may be invalid, causing issues with SCEV expansion. Depends on D122095. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D122096	2022-04-19 13:06:39 +02:00
Craig Topper	ac8c720d48	[IR] Allow constant folding (insertelement <vscale x 2 x i32> zeroinitializer, i32 0, i32 i32 0. Most of insertelement constant folding is blocked if the vector type is scalable. I believe we can make an exception for inserting null into an all zeros vector. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D123413	2022-04-15 17:44:32 -07:00
Florian Hahn	73f5d7d0d6	[VPlan] Handle equal address and store ops in onlyFirstLaneDemanded. With opaque pointers, the stored value and address can be the same. Previously the code in VPWidenMemoryInstructionRecipe::onlyFirstLaneDemanded incorrectly considers stores with matching store and pointer operands as only demanding the first lane, causing a crash.	2022-04-15 22:53:33 +02:00
Muhammad Omair Javaid	42ebfa8269	Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth" This reverts commit `64b6192e81`. This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage: https://lab.llvm.org/buildbot/#/builders/176/builds/1515 llvm-tblgen crashes after applying this patch.	2022-04-13 04:53:07 +05:00
Simon Pilgrim	431e93f4f5	[InstCombine] Fold sub(add(x,y),min/max(x,y)) -> max/min(x,y) (PR38280) As discussed on Issue #37628, we can flip a min/max node if we're subtracting from the sum of the node's operands Alive2: https://alive2.llvm.org/ce/z/W_KXfy Differential Revision: https://reviews.llvm.org/D123399	2022-04-11 11:32:56 +01:00
Florian Hahn	5f1eb74850	[VPlan] Place VPExpandSCEVRecipe in pre-header. After D121624 models the pre-header in VPlan, VPExpandSCEVRecipes can be placed there. This ensures SCEV expansion happens before modifying the CFG during VPlan execution, when CFG is incomplete. Depends on D121624. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D122095	2022-04-10 10:26:20 +02:00
Florian Hahn	256c6b0ba1	[VPlan] Model pre-header explicitly. This patch extends the scope of VPlan to also model the pre-header. The pre-header can be used to place recipes that should be code-gen'd outside the loop, like SCEV expansion. Depends on D121623. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121624	2022-04-09 14:19:47 +02:00
Simon Pilgrim	450f0d76b4	[LoopVectorize] Regenerate first-order-recurrence.ll	2022-04-09 10:33:03 +01:00
Stanislav Mekhanoshin	fced87d457	[AMDGPU] Fix regression with vectorization limiting D67148 has removed TTI::getNumberOfRegisters(bool Vector) and started to call TTI::getNumberOfRegisters(unsigned ClassID) from the LoopVectorize. This has resulted in an unrestricted vectorization on AMDGPU blowing up register pressure. Differential Revision: https://reviews.llvm.org/D122850	2022-04-08 17:46:49 -07:00
Florian Hahn	467dbcd9f1	[LV] Set debug loc after setting insert point. This fixes the code to actually use the location of the instruction, if available. Previously, SetInsertPoint would overwrite the insert point set from the instruction.	2022-04-08 20:34:40 +02:00
Florian Hahn	4c0d5db9c9	[LV] Add test case for wrong debug location with replicate recipe.	2022-04-08 20:34:16 +02:00
Florian Hahn	29fe998eaa	[VPlan] Preserve debug location when creating branch. Update createEmptyBasicBlock to preserve the debug location of the previous terminator.	2022-04-08 17:22:53 +02:00
Florian Hahn	547567fe2b	[LV] Add test for missing debug info on branch in vector loop. Adds a test case where currently no debug location is added to branches in the vector body.	2022-04-08 17:22:53 +02:00
Florian Hahn	631016a853	[LV] Add test case for PR54427. Reduced test for #54427.	2022-04-07 23:21:21 +02:00
Jingu Kang	64b6192e81	[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth Set the maximum VF of AArch64 with 128 / the size of smallest type in loop. Differential Revision: https://reviews.llvm.org/D118979	2022-04-05 13:16:52 +01:00
Florian Hahn	1ff022e21b	[LV] Add vector.body block to parent loop during skeleton creation. When creating induction resume values, SCEV queries may rely on LoopInfo. Make sure vector.body gets added to the loop of the pre-header during skeleton construction. %vector.body will be moved to the vector preheader during VPlan execution. Fixes #54745.	2022-04-05 11:54:17 +01:00
Florian Hahn	368d35a894	[LV] Add additional tests for pointer difference memory checks. Additional tests for D119078.	2022-04-04 17:58:48 +01:00
Philip Reames	88de27e3fd	[LV] Handle non-integral types when considering interleave widening legality In general, anywhere we might need to insert a blind bitcast, we need to make sure the types are losslessly convertible. This fixes pr54634.	2022-04-03 20:16:20 -07:00
Dávid Bolvanský	872f7000fc	Revert "[NFCI] Regenerate SROA/LoopVectorize test checks" This reverts commit `14e3450fb5`.	2022-04-04 01:15:30 +02:00
Dávid Bolvanský	a113a582b1	[NFCI] Regenerate LoopVectorize test checks	2022-04-03 21:56:24 +02:00
Florian Hahn	95b2aa511e	[VPlan] Set VPlan header block name to vector.body. This brings the VPlan block naming in line with the naming of the generated basic blocks.	2022-04-02 19:34:32 +01:00
Florian Hahn	a08c90a402	[LV] Re-use TripCount from EPI.TripCount. During skeleton construction for the epilogue vector loop, generic helpers use getOrCreateTripCount, which will re-expand the trip count computation. Instead, re-use the TripCount created during main loop vectorization.	2022-04-01 13:47:34 +01:00
David Green	b65267ca7b	[LV] Invalidate widening decisions after maximizing vector bandwidth When MaximizeVectorBandwidth is enabled, we can end up (via calls to collectUniformsAndScalars/setCostBasedWideningDecision through calculateRegisterUsage) making widening decisions before we have decided whether to fold the tail by masking. These decisions will be wrong if we later decided to fold the tail, for example when the trip count is very low. It will use incorrect costs for loads that should get masked, using standard memory operation costs instead. This still at the moment uses the EmulatedMaskMemRefHack costs (a bit unfortunately), but the old costs without this change were 1, leading to too optimistic vectorization. This slightly changes the way that the MaximizeVectorBandwidth option works to make it easier to test, always honouring the option if it is set. Differential Revision: https://reviews.llvm.org/D120215	2022-03-31 09:19:31 +01:00
Florian Hahn	ecb4171dcb	[LV] Handle zero cost loops in selectInterleaveCount. In some cases, like in the added test case, we can reach selectInterleaveCount with loops that actually have a cost of 0. Unfortunately a loop cost of 0 is also used to communicate that the cost has not been computed yet. To resolve the crash, bail out if the cost remains zero after computing it. This seems like the best option, as there are multiple code paths that return a cost of 0 to force a computation in selectInterleaveCount. Computing the cost at multiple places up front there would unnecessarily complicate the logic. Fixes #54413.	2022-03-29 22:52:43 +01:00
Florian Hahn	46432a0088	[VPlan] Add VPWidenPointerInductionRecipe. This patch moves pointer induction handling from VPWidenPHIRecipe to its own recipe. In the process, it adds all information required to generate code for pointer inductions without relying on Legal to access the list of induction phis. Alternatively VPWidenPHIRecipe could also take an optional pointer to InductionDescriptor. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121615	2022-03-24 14:58:45 +00:00
Florian Hahn	890fc21742	[LV] Extend checks in debugloc.ll.	2022-03-23 20:21:58 +00:00
Florian Hahn	973183612e	[VPlan] Add test for VPExpandSCEVRecipe printing.	2022-03-20 10:11:40 +00:00
Florian Hahn	d5fbcf76fd	[VPlan] Improve pattern in vplan-printing.ll check line. The existing pattern only matched a single value, which breaks if the numbering slightly changes.	2022-03-19 16:03:25 +00:00
Andrew Wei	0af3e6a22d	[InstCombine] Sink instructions with multiple users in a successor block. This patch tries to sink instructions when they are only used in a successor block. This is a further enhancement patch based on Anna's commit: D109700, which allows sinking an instruction having multiple uses in a single user. In this patch, sink instructions with multiple users in a single successor block will be supported. It could fix a known issue from rust: https://github.com/rust-lang/rust/issues/51346#issuecomment-394443610 Reviewed By: nikic, reames Differential Revision: https://reviews.llvm.org/D121585	2022-03-18 11:53:45 +08:00
Florian Hahn	151c144350	[LV] Use usesScalars in widenPHIInstruction. This uses the existing VPlan helpers to check whether there are scalar uses of a phi recipe. It removes one of the few remaining dependencies on the cost model from VPlan code generation. Depends on D121612. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D121613	2022-03-17 13:16:32 +00:00
Malhar Jajoo	a36d269658	[VPlan] Avoid collecting scalars for SVE This patch ensures scalars (except for uniforms) are no longer collected (prior to LVP planning phase) for scalable vectorization. This is to avoid the chances of generating scalarized instructions later (during LVP execute phase) as they are not supported for scalable vectorization. Relevant test has also been added. Differential Revision: https://reviews.llvm.org/D121452	2022-03-16 16:33:34 +00:00
Florian Hahn	5c4d64eb0d	[LV] Make reduction-order.ll test independent of instruction naming. Also update test to not use branch on undef.	2022-03-15 11:13:18 +00:00
Florian Hahn	4a0481e981	[LV] Check for users of truncated IVs, add more detailed comment. Add missing outside user check for truncated IVs. Also hoist the code in the helper with additional explanations. Fixes #54370.	2022-03-14 19:39:30 +00:00
Florian Hahn	1c0fc1f074	[VPlan] Ensure each iv user is only visited once in transform. If a recipe has multiple uses of an IV, we crash. It causes a crash when building llvm-test-suite. Exposed by `95f76bff1c`.	2022-03-13 21:42:17 +00:00
Florian Hahn	95f76bff1c	[LV] Create & use VPScalarIVSteps for all scalar users. This patch is a follow-up to D115953. It updates optimizeInductions to also introduce new VPScalarIVStepsRecipes if an IV has both vector and scalar uses. It updates all uses that only need scalar values to use the newly created recipe for the scalar steps. This completes untangling of VPWidenIntOrFpInductionRecipe code-generation. Now the recipe only creates the widened vector values, as it says on the tin. The code to generate IR has been moved directly to VPWidenIntOrFpInductionRecipe::execute. Note that the recipe has been updated to hold a reference to ScalarEvolution, which is needed to expand the step, until we can place the corresponding SCEV expansion in the pre-header. Depends on D120827. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D120828	2022-03-13 17:15:24 +00:00

1661 Commits