llvm-project

Commit Graph

Author	SHA1	Message	Date
Alexey Bataev	07015e12f0	[SLP]Fix PR59053: trying to erase instruction with users. Need to count the reduced values, vectorized in the tree but not in the top node. Such scalars still must be extracted out of the vector node instead of the original scalar.	2022-11-17 17:23:48 -08:00
Stanislav Mekhanoshin	bcaf31ec3f	[AMDGPU] Allow finer grain control of an unaligned access speed. A target can return if a misaligned access is 'fast' as defined by the target or not. In reality there can be different levels of 'fast' and 'slow'. This patch changes the boolean 'Fast' argument of the allowsMisalignedMemoryAccesses family of functions to an unsigned representing its speed. A target can still define it as it wants and the direct translation of the current code uses 0 and 1 for current false and true. This makes the change an NFC. Subsequent patch will start using an actual value of speed in the load/store vectorizer to compare if a vectorized access is going to be not just fast, but not slower than before. Differential Revision: https://reviews.llvm.org/D124217	2022-11-17 09:23:53 -08:00
Florian Hahn	55f56cdc33	[VPlan] Introduce VPValue::hasDefiningRecipe helper (NFC). This clarifies the intention of code that uses the helper. Suggested by @Ayal during review of D136068, thanks!	2022-11-16 23:12:40 +00:00
Florian Hahn	aa16689f82	[VPlan] Use recipe type to avoid getDefiningRecipe call (NFC). Suggested by @Ayal during review of D136068, thanks!	2022-11-16 23:03:34 +00:00
Florian Hahn	239b52d4b6	[VPlan] Update stale comment (NFC). Update comment to reflect current code, which also allows for VPScalarIVStepsRecipes to be uniform. Suggested by @Ayal during review of D136068, thanks!	2022-11-16 22:39:50 +00:00
Florian Hahn	bcc9c5d959	[LV] Replace unnecessary cast_or_null with cast (NFC). The existing code already unconditionally dereferences RepR, so cast_or_null can be replaced by just cast. Suggested by @Ayal during review of D136068, thanks!	2022-11-16 22:31:59 +00:00
Florian Hahn	32f1c5531b	[VPlan] Update VPValue::getDef to return VPRecipeBase, adjust name (NFC). The return value of getDef is guaranteed to be a VPRecipeBase and all users can also accept a VPRecipeBase *. Most users actually cast to VPRecipeBase or a specific recipe before using it, so this change removes a number of redundant casts. Also rename it to getDefiningRecipe to make the name a bit clearer. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D136068	2022-11-16 22:12:08 +00:00
Alexey Bataev	9f9fdab9f1	[SLP]Fix PR58766: deleted value used after vectorization. If same instruction is reduced several times, but in one graph is part of buildvector sequence and in another it is vectorized, we may lose information that it was part of buildvector and must be extracted from later vectorized value.	2022-11-16 10:57:03 -08:00
Alexey Bataev	2f8f17c157	[SLP]Fix PR58956: fix insertpoint for reduced buildvector graphs. If the graph is only the buildvector node without main operation, need to inherit insertpoint from the reduction instruction. Otherwise the compiler crashes trying to insert instruction at the entry block.	2022-11-16 07:38:49 -08:00
Alexey Bataev	0a33ceee01	[SLP]Fix a crash on analysis of the vectorized node. Need to use advanced check for the same vectorized node to avoid possible compiler crash. We may have 2 similar nodes (vector one and gather) after graph nodes rotation, need to do extra checks for the exact match.	2022-11-15 13:40:28 -08:00
OCHyams	139e08efc5	[Assignment Tracking][23/*] Account for assignment tracking in SLP Vectorizer. The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir The SLP-Vectorizer can merge a set of scalar stores into a single vectorized store. Merge DIAssignID intrinsics from the scalar stores onto the new vectorized store. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D133320	2022-11-15 15:20:18 +00:00
Jordan Rupprecht	81896f88ce	[NFC] Remove unused OrigLoopID vars	2022-11-11 07:51:40 -08:00
Florian Hahn	2d7e5e29b7	[LV] Remove unused OrigLoopID argument from completeLoopSkeleton (NFC). The argument is not used any longer and can be removed.	2022-11-11 15:39:08 +00:00
Sanjay Patel	b57819e130	[VectorCombine] widen a load with subvector insert This adapts/copies code from the existing fold that allows widening of load scalar+insert. It can help in IR because it removes a shuffle, and the backend can already narrow loads if that is profitable in codegen. We might be able to consolidate more of the logic, but handling this basic pattern should be enough to make a small difference on one of the motivating examples from issue #17113. The final goal of combining loads on those patterns is not solved though. Differential Revision: https://reviews.llvm.org/D137341	2022-11-10 14:11:32 -05:00
Alexey Bataev	b505fd559d	[SLP]Redesign vectorization of the gather nodes. Gather nodes are vectorized as simply vector of the scalars instead of relying on the actual node. It leads to the fact that in some cases we may miss incorrect transformation (non-matching set of scalars is just ended as a gather node instead of possible vector/gather node). Better to rely on the actual nodes, it allows to improve stability and better detect missed cases. Differential Revision: https://reviews.llvm.org/D135174	2022-11-10 10:59:54 -08:00
Alexey Bataev	b5d91ab73e	[SLP]Fix PR58863: Mask index beyond mask size for non-power-2 insertelement analysis. Need to check if the insertelement mask size is reached during cost analysis to avoid compiler crash. Differential Revision: https://reviews.llvm.org/D137639	2022-11-08 07:54:57 -08:00
skc7	42bce72536	Reapply "[SLP] Extend reordering data of tree entry to support PHI nodes". Reapplies `87a2086` (which was reverted in `656f1d8`). Fix for scalable vectors in getInsertIndex merged in `46d53f4`. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D137537	2022-11-08 21:21:28 +05:30
Nathan James	6aa050a690	Reland "[llvm][NFC] Use c++17 style variable type traits" This reverts commit `632a389f96`. This relands commit `1834a310d0`. Differential Revision: https://reviews.llvm.org/D137493	2022-11-08 14:15:15 +00:00
Nathan James	632a389f96	Revert "[llvm][NFC] Use c++17 style variable type traits" This reverts commit `1834a310d0`.	2022-11-08 13:11:41 +00:00
skc7	46d53f45d8	[SLP][NFC] Restructure getInsertIndex Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D137567	2022-11-08 18:07:50 +05:30
Nathan James	1834a310d0	[llvm][NFC] Use c++17 style variable type traits This was done as a test for D137302 and it makes sense to push these changes Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D137493	2022-11-08 12:22:52 +00:00
skc7	9d96feb19b	[SLP][NFC] Restructure areTwoInsertFromSameBuildVector Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D137569	2022-11-08 09:32:19 +05:30
Alexey Bataev	ecd0b5a532	Revert "[SLP]Redesign vectorization of the gather nodes." This reverts commit `8ddd1ccdf8` to fix buildbots failures reported in https://lab.llvm.org/buildbot#builders/74/builds/14839	2022-11-07 08:35:21 -08:00
Alexey Bataev	8ddd1ccdf8	[SLP]Redesign vectorization of the gather nodes. Gather nodes are vectorized as simply vector of the scalars instead of relying on the actual node. It leads to the fact that in some cases we may miss incorrect transformation (non-matching set of scalars is just ended as a gather node instead of possible vector/gather node). Better to rely on the actual nodes, it allows to improve stability and better detect missed cases. Differential Revision: https://reviews.llvm.org/D135174	2022-11-07 07:04:38 -08:00
David Green	656f1d8b74	Revert "[SLP] Extend reordering data of tree entry to support PHI nodes" This reverts commit `87a20868eb` as it has problems with scalable vectors and use-list orders. Test to follow.	2022-11-06 11:43:51 +00:00
Sanjay Patel	710e34e136	[VectorCombine] move load safety checks to helper function; NFC These checks can be re-used with other potential transforms such as a load of a subvector-insert.	2022-11-04 10:39:37 -04:00
LiDongjin	d1cee3539f	[LoopVectorize] Fix crash on "Cannot dereference end iterator!"(PR56627) Check hasOneUser before user_back(). Differential Revision: https://reviews.llvm.org/D136227	2022-11-03 23:13:37 +08:00
Alexey Bataev	f090e3c00f	[SLP]Fix write after bounds. Need to use comma instead of + symbol to prevent writing after bounds.	2022-11-03 05:30:41 -07:00
Alexey Bataev	8b015b2078	[SLP][NFC]Formatting and reduce number of iterations, NFC.	2022-11-03 05:30:13 -07:00
skc7	87a20868eb	[SLP] Extend reordering data of tree entry to support PHI nodes Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D136757	2022-11-01 04:50:04 +00:00
Alexey Bataev	99f9bd4807	[SLP]Fix a crash in the analysis of the compatible cmp operands. We can skip the analysis of the operands opcodes, can compare directly them in some cases.	2022-10-31 09:47:25 -07:00
Florian Hahn	43f0f1a66f	[VPlan] Use onlyFirstLaneUsed in sinkScalarOperands. Replace custom code to check if only the first lane is used by generic helper `onlyFirstLaneUsed`. This enables VPlan-based sinking in a few additional cases and was suggested in D133760. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D136368	2022-10-29 19:45:19 +01:00
Alexey Bataev	2ec51f1c75	[SLP]Improve analysis of same/alternate code ops and scheduling. Should improve compile time for analysis and vectorization. Metric: SLP.NumVectorInstructions Program SLP.NumVectorInstructions test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test 6380.00 6378.00 -0.0% test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test 6380.00 6378.00 -0.0% test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test 2023.00 2022.00 -0.0% test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test 148.00 146.00 -1.4% Generated more vector instructions. Differential Revision: https://reviews.llvm.org/D127531	2022-10-27 16:29:16 -07:00
Alexey Bataev	8ce0c7b1c9	Revert "[SLP]Improve analysis of same/alternate code ops and scheduling." This reverts commit `dad64448c6` to fix a crash in https://lab.llvm.org/buildbot/#/builders/74/builds/14584	2022-10-27 15:21:35 -07:00
Alexey Bataev	dad64448c6	[SLP]Improve analysis of same/alternate code ops and scheduling. Should improve compile time for analysis and vectorization. Metric: SLP.NumVectorInstructions Program SLP.NumVectorInstructions test-suite :: External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test 6380.00 6378.00 -0.0% test-suite :: External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test 6380.00 6378.00 -0.0% test-suite :: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test 2023.00 2022.00 -0.0% test-suite :: External/SPEC/CINT2006/471.omnetpp/471.omnetpp.test 148.00 146.00 -1.4% Generated more vector instructions. Differential Revision: https://reviews.llvm.org/D127531	2022-10-27 11:31:18 -07:00
Philip Reames	269bc684e7	[LV][RISCV] Disable vectorization of epilogue loops. Epilogue loop vectorization is a feature in the vectorizer intended to avoid running fully scalar code when the vector length of the main loop turns out to be either longer than the trip count of the actual loop, or with a huge remainder. In practice, this feature appears to not have been well tuned. I honestly don't think it should be on by default at all, but it definitely shouldn't be on for RISCV. Note that other targets have also disabled it, but they've done so via disabling interleaving - which is, well, completely unrelated - and we don't want to do that for RISCV. In the near term, many examples I'm seeing have terrible codegen for epilogue vectorization. We are greatly increasing code size for little value at reasonable VLEN values for small types. In the long term, the cases that epilogue vectorization is intended to handle are likely better handled via tail folding on RISCV. As an aside, I also don't really trust the correctness of epilogue vectorization. The code structure is such that otherwise straightforward changes sometimes break only epilogue vectorization. The reuse of an existing vplan without careful validation opens significant room for nasty bugs. Given how rarely the code is exercised, that is not a good combination. As such, this patch introduces a TTI hook, and completely disables epilogue vectorization on RISCV. Differential Revision: https://reviews.llvm.org/D136695	2022-10-25 14:28:02 -07:00
Alexey Bataev	da4e0f7ac5	[SLP][NFC]Fix PR58476: Fix compile time for reductions, NFC. Improve O(N^2) to O(N) in some cases, reduce number of allocations by reserving memory. Also, improve analysis of loads reduction values to avoid analysis of not vectorizable cases.	2022-10-24 10:13:24 -07:00
Florian Hahn	7eb4ec1c75	[VPlan] Print predicates for widened cmp instructions (NFC).	2022-10-21 08:54:11 +01:00
Paul Walker	ab8257ca0e	[NFC] Fix a few whitespace inconsistencies.	2022-10-20 14:52:25 +00:00
Florian Hahn	e25ed058bc	[LV] Use buildScalarSteps to also handle VF = 1. (NFCI) The code in buildScalarSteps already properly handles creating the scalar induction values with VF = 1. Use it directly instead of using extra code to handle that case. Suggested by @Ayal in D133760.	2022-10-20 14:30:01 +01:00
Alexey Bataev	b8b740c834	[SLP][NFC]Remove unused variable, NFC.	2022-10-19 12:35:27 -07:00
Florian Hahn	d72fcee8f4	[VPlan] Add VPValue::isDefinedOutsideVectorRegions helper (NFC). @Ayal suggested a better named helper than using `!getDef()` to check if a value is invariant across all parts. The property we are using here is that the VPValue is defined outside any vector loop region. There's a TODO left to handle recipes defined in pre-header blocks. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133666	2022-10-19 13:20:30 +01:00
Alexey Bataev	087dadfd37	[SLP]Generalize cost model. Generalized the cost model estimation. Improved cost model estimation for repeated scalars (no need to count their cost anymore), improved cost model for extractelement instructions. cpu2017 511.povray_r 0.57 520.omnetpp_r -0.98 521.wrf_r -0.01 525.x264_r 3.59 <+ 526.blender_r -0.12 531.deepsjeng_r -0.07 538.imagick_r -1.42 Geometric mean: 0.21 Differential Revision: https://reviews.llvm.org/D115757	2022-10-18 11:55:59 -07:00
Alexey Bataev	62267e8de0	Revert "[SLP]Generalize cost model." This reverts commit `f12fb91188` and `f5c747bfbe` to fix detected non-initialized var use.	2022-10-18 11:25:59 -07:00
Alexey Bataev	f5c747bfbe	[SLP][NFC]Fix a warning for ?: with enum/unsigned, NFC.	2022-10-18 10:08:05 -07:00
Alexey Bataev	f12fb91188	[SLP]Generalize cost model. Generalized the cost model estimation. Improved cost model estimation for repeated scalars (no need to count their cost anymore), improved cost model for extractelement instructions. cpu2017 511.povray_r 0.57 520.omnetpp_r -0.98 521.wrf_r -0.01 525.x264_r 3.59 <+ 526.blender_r -0.12 531.deepsjeng_r -0.07 538.imagick_r -1.42 Geometric mean: 0.21 Differential Revision: https://reviews.llvm.org/D115757	2022-10-18 08:49:32 -07:00
Alexey Bataev	e79532d28c	[SLP][NFC]Try to fix MSVC buildbots with a workaround, NFC.	2022-10-18 07:50:10 -07:00
Alexey Bataev	6a6fc4890d	[SLP][NFC]Formatting of the getEntryCost function, NFC.	2022-10-18 07:18:26 -07:00
Sanjay Patel	8d76fbb5f0	[VectorCombine] fix crashing on match of non-canonical fneg We can't assume that operand 0 is the negated operand because the matcher handles "fsub -0.0, X" (and also +0.0 with FMF). By capturing the extract within the match, we avoid the bug and make the transform more robust (can't assume that this pass will only see canonical IR).	2022-10-17 10:47:48 -04:00
Kazu Hirata	b2f41e9ac1	[Vectorize] Use std::conditional_t (NFC)	2022-10-15 14:52:25 -07:00

3453 Commits