llvm-project

Commit Graph

Author	SHA1	Message	Date
Chenbing Zheng	851447cb32	[InstCombine] remove useless insertelement extractelement (bitcast (insertelement (Vec, b)), a) -> extractelement (bitcast (Vec), a) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D128890	2022-07-06 17:05:27 +08:00
Kazu Hirata	129b531c9c	[llvm] Use value_or instead of getValueOr (NFC)	2022-06-18 23:07:11 -07:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Sanjay Patel	05527b68a0	[InstCombine] fold more shuffles with FP<->Int cast operands shuffle (cast X), (cast Y), Mask --> cast (shuffle X, Y, Mask) This extends the transform added with `0353c2c996`. If the shuffle reduces vector length, the transform reduces the width of the cast, so that should be a win for most codegen (if not, it can be inverted).	2022-05-24 15:11:38 -04:00
Sanjay Patel	dbf3b5f114	[InstCombine] fold more shuffles with FP<->Int cast operands shuffle (cast X), (cast Y), Mask --> cast (shuffle X, Y, Mask) This extends the transform added with `0353c2c996`. If the casts are to a larger element type, the transform reduces shuffle bit width, so that should be a win for most codegen (if not, it can be inverted).	2022-05-17 14:25:11 -04:00
Sanjay Patel	0353c2c996	[InstCombine] fold shuffles with FP<->Int cast operands shuffle (cast X), (cast Y), Mask --> cast (shuffle X, Y, Mask) This is similar to a recent transform with fneg ( `b331a7ebc1` ), but this is intentionally the most conservative first step to try to avoid regressions in codegen. There are several restrictions that could be removed as follow-up enhancements. Note that a cast with a unary shuffle is currently canonicalized in the other direction (shuffle after cast - D103038 ). We might want to invert that to be consistent with this patch.	2022-05-10 14:20:43 -04:00
Sanjay Patel	b331a7ebc1	[InstCombine] canonicalize fneg after shuffle For the unary shuffle pattern, this is opposite to what we try to do with binops, but it seems better to keep it consistent with the motivating binary shuffle pattern. On that, it is clearly better on the usual no-extra uses case. There is a chance that this will pull an fneg away from some other binop and cause a regression in codegen, but that should be invertible in the backend. The transform is bidirectional: https://alive2.llvm.org/ce/z/kKaKCU https://alive2.llvm.org/ce/z/3Desfw Fixes #45631	2022-05-06 16:30:26 -04:00
Sanjay Patel	5dbb53b1b4	[InstCombine] merge shuffled vector negate and multiply Add the "(0 - X) --> (X * -1)" reverse identity to the list of alternate form binops. We need a little hack to make the existing logic work because it does not expect to move constants from op0 to op1, but the code comment hopefully makes that clear. I don't think there are any other identities like that. Fixes #54364 Differential Revision: https://reviews.llvm.org/D122390	2022-03-24 10:25:16 -04:00
Sanjay Patel	ccf8c969c2	[InstCombine] reorder code, fix formatting; NFC The affected code can be updated to solve #54364, so make some cosmetic diffs before real changes.	2022-03-22 16:33:01 -04:00
serge-sans-paille	59630917d6	Cleanup includes: Transform/Scalar Estimated impact on preprocessor output line: before: 1062981579 after: 1062494547 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120817	2022-03-03 07:56:34 +01:00
Nikita Popov	e6f31f4e51	[InstCombine] Use GEP type instead of pointee type The GEP source type is independent of whether it is a scalar or vector GEP, as such we can simply preserve it.	2021-12-28 14:57:43 +01:00
Sanjay Patel	3db974face	[InstCombine] convert static function to internal class function; NFC The transform can require an optional shuffle instruction to be sound, so we need to use Builder to create all values and then replace the original instruction with whatever that final value is.	2021-12-14 11:18:35 -05:00
Philip Reames	e6ad9ef4e7	[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transforms are inconsistent about which types we end up with based on visit order. I'm restricting this to constants as for non-constants, we'd have to decide whether the simplicity was worth extra instructions. For constants, there are no extra instructions. We chose the canonical type as i64 arbitrarily. We might consider changing this to something else in the future if we have cause. Differential Revision: https://reviews.llvm.org/D115387	2021-12-13 16:56:22 -08:00
Philip Reames	98f5ab6af3	[instcombine] Do demanded elts last when visiting extractelement This reorders existing transforms to put demanded elements last. The reasoning here is that when we have an example which can be scalarized or handled via demanded bits, we should prefer scalarization as that doesn't require dropping flags on arithmetic instructions. This doesn't show major changes in the tests today, but once I add support for fast math flags to dropPoisonGeneratingFlags this becomes glaringly obvious. Differential Revision: https://reviews.llvm.org/D115394	2021-12-09 10:04:49 -08:00
Philip Reames	56fa334333	[instcombine] A couple style tweaks to visitExtractElementInst [nfc]	2021-12-08 12:23:50 -08:00
Piotr Sobczak	03961709ed	[InstCombine] Extend pattern to replace shuffle's insertelement operand In D71220 a pattern was added to replace shuffle's insertelement operand if inserted scalar is not demanded. The pattern was added only for the case where the shuffle's mask size is equal to element's vector size. However, that condition is not required because the pattern does not change the shuffle vector size. This patch extends the pattern to also include cases where shuffle's mask size is not equal to element's vector size. Differential Revision: https://reviews.llvm.org/D112318	2021-11-03 09:43:04 +01:00
Sanjay Patel	2a3cc4d461	[Analysis] add utility function for unary shuffle mask creation This is NFC-intended for the callers. Posting in case there are other potential users that I missed. I would also use this from VectorCombine in a patch for: https://llvm.org/PR52178 ( D111901 ) Differential Revision: https://reviews.llvm.org/D111891	2021-10-18 09:00:39 -04:00
Sanjay Patel	d95ebef4b8	[InstCombine] ease use check for fold of bitcasted extractelt to trunc This helps with examples like: https://llvm.org/PR52057 ...but we need at least one more fold to fix that case.	2021-10-07 15:09:34 -04:00
Sanjay Patel	db231ebdb0	[InstCombine] fold fake vector extract to shift+trunc We already handle more complicated cases like: extelt (bitcast (inselt poison, X, 0)) --> trunc (lshr X) But we missed this simpler pattern: https://alive2.llvm.org/ce/z/D55h64 / https://alive2.llvm.org/ce/z/GKzzRq This is part of solving: https://llvm.org/PR52057 I made the transform depend on legal/desirable int type to avoid creating a shift of an illegal type (for example i128). I'm not sure if that restriction is actually necessary, but we can change that as a follow-up if the backend can deal with integer ops on too-wide illegal types. The pile of AVX512 test changes are all neutral AFAICT - the x86 backend seems to know how to turn that into the expected "kmov" instructions. Differential Revision: https://reviews.llvm.org/D111082	2021-10-06 08:12:05 -04:00
hyeongyu kim	10a5632550	[NFC][InstCombine] Fix inconsistent comments	2021-09-23 09:31:39 +09:00
hyeongyu kim	98e96663f6	[InstCombine] Update InstCombine to use poison instead of undef for shufflevector's placeholder (3/3) This patch is for fixing potential shufflevector-related bugs like D93818. As D93818, this patch change shufflevector's default placeholder to poison. To reduce risk, it was divided into several patches, and this patch is for InstCombineVectorOps. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D110230	2021-09-23 00:48:24 +09:00
Florian Hahn	e08a5dc86f	[InstCombine] Move InstCombineWorklist to Utils to allow reuse (NFC). InstCombine's worklist can be re-used by other passes like VectorCombine. Move it to llvm/Transform/Utils and rename it to InstructionWorklist. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D110181	2021-09-22 08:47:21 +01:00
Simon Pilgrim	fc8f1e4419	[InstCombine] foldConstantInsEltIntoShuffle - bail if we fail to find constant element (PR51824) If getAggregateElement() returns null for any element, early out as otherwise we will assert when creating a new constant vector Fixes PR51824 + ; OSS-Fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38057	2021-09-21 13:01:09 +01:00
Sanjay Patel	e5a32d720e	[InstCombine] move extend after insertelement if both operands are extended I was wondering how instcombine does on the examples in D109236, and we're missing a basic transform: inselt (ext X), (ext Y), Index --> ext (inselt X, Y, Index) https://alive2.llvm.org/ce/z/z2aBu9 Note that there are several possible extensions of this fold (see TODO comments). Differential Revision: https://reviews.llvm.org/D109537	2021-09-15 14:38:03 -04:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Sanjay Patel	d59ae12d58	[InstCombine] fix typo; NFC	2021-08-31 09:02:14 -04:00
David Sherwood	ce394161cb	[InstCombine] Add more complex folds for extractelement + stepvector I have updated cheapToScalarize to also consider the case when extracting lanes from a stepvector intrinsic. This required removing the existing 'bool IsConstantExtractIndex' and passing in the actual index as a Value instead. We do this because we need to know if the index is <= known minimum number of elements returned by the stepvector intrinsic. Effectively, when extracting lane X from a stepvector we know the value returned is also X. New tests added here: Transforms/InstCombine/vscale_extractelement.ll Differential Revision: https://reviews.llvm.org/D106358	2021-08-10 09:17:21 +01:00
Paul Walker	287d39dd5a	[NFC] Fix a few whitespace issues and typos.	2021-07-04 11:49:58 +01:00
Roman Lebedev	6cf6f6f65f	[NFC][InstCombine] foldAggregateConstructionIntoAggregateReuse(): cast to Instruction eagerly In all of these, the value must be an instruction for us to succeed anyway, so change it to maybe hopefully make further changes more straight-forward.	2021-06-29 13:29:18 +03:00
Caroline Concatto	1ad52105eb	[InstCombine] Add fold for extracting known elements from a stepvector This patch allows folding stepvector + extract to the lane when the lane is lower than the minimum size of the scalable vector. This fold is possible because lane X of a stepvector is also X! For instance, extracting element 3 of a <vscale x 4 x i64>stepvector is 3. Differential Revision: https://reviews.llvm.org/D103153	2021-06-10 13:36:57 +01:00
Juneyoung Lee	7161bb87c9	[InsCombine] Fix a few remaining vec transforms to use poison instead of undef This is a patch that replaces shufflevector and insertelement's placeholder value with poison. Underlying motivation is to fix the semantics of shufflevector with undef mask to return poison instead (D93818) The consensus has been made in the late 2020 via mailing list as well as the thread in https://bugs.llvm.org/show_bug.cgi?id=44185 . This patch is a simple syntactic change to the existing code, hence directly pushed as a commit.	2021-05-31 18:47:09 +09:00
David Sherwood	70d8365e33	Fix warning introduced by `9c766f4090`	2021-05-26 10:20:39 +01:00
David Sherwood	9c766f4090	[InstCombine] Fold extractelement + vector GEP with one use We sometimes see code like this: Case 1: %gep = getelementptr i32, i32* %a, <2 x i64> %splat %ext = extractelement <2 x i32> %gep, i32 0 or this: Case 2: %gep = getelementptr i32, <4 x i32> %a, i64 1 %ext = extractelement <4 x i32> %gep, i32 0 where there is only one use of the GEP. In such cases it makes sense to fold the two together such that we create a scalar GEP: Case 1: %ext = extractelement <2 x i64> %splat, i32 0 %gep = getelementptr i32, i32 %a, i64 %ext Case 2: %ext = extractelement <2 x i32> %a, i32 0 %gep = getelementptr i32, i32 %ext, i64 1 This may create further folding opportunities as a result, i.e. the extract of a splat vector can be completely eliminated. Also, even for the general case where the vector operand is not a splat it seems beneficial to create a scalar GEP and extract the scalar element from the operand. Therefore, in this patch I've assumed that a scalar GEP is always preferable to a vector GEP and have added code to unconditionally fold the extract + GEP. I haven't added folds for the case when we have both a vector of pointers and a vector of indices, since this would require generating an additional extractelement operation. Tests have been added here: Transforms/InstCombine/gep-vector-indices.ll Differential Revision: https://reviews.llvm.org/D101900	2021-05-26 09:54:26 +01:00
Sanjay Patel	5577e86691	[InstCombine] fold extract subvector of bitcast insertelt This is visible in the original example from: https://llvm.org/PR50055 (but this change doesn't solve the bug) https://alive2.llvm.org/ce/z/vM_Yq-	2021-05-10 17:20:10 -04:00
Juneyoung Lee	1c10201d96	Update InstCombine to use undef matcher instead This is a patch to use m_Undef() matcher instead of isa<UndefValue>(). As suggested in D100122, this update is separately committed.	2021-04-18 11:05:36 +09:00
Sanne Wouda	05a6e2eb9a	[InstCombine] Add a combine for a shuffle of similar bitcasts Some intrinsics wrapper code has the habit of ignoring the type of the elements in vectors, thinking of vector registers as a "bag of bits". As a consequence, some operations are shared between vectors of different types. For example, functions that rearrange elements in a vector can be shared between vectors of int32 and float. This can result in bitcasts in awkward places that prevent the backend from recognizing some instructions. For AArch64 in particular, it inhibits the selection of dup from a general purpose register (GPR), and mov from GPR to a vector lane. This patch adds a pattern in InstCombine to move the bitcasts past the shufflevector if this is possible. Sometimes this even allows InstCombine to remove the bitcast entirely, as in the included tests. Alternatively this could be done with a few extra patterns in the AArch64 backend, but InstCombine seems like a better place for this. Differential Revision: https://reviews.llvm.org/D97397	2021-03-08 16:32:30 +00:00
Sanne Wouda	5e963a2441	Rehome an orphaned comment [NFC] As seen in `35827164c4`, the "shuffle x, x, mask" comment has drifted away from the implementation of the pattern. Put it back.	2021-03-08 16:32:30 +00:00
Kazu Hirata	def7cfb7ff	[InstCombine] Use is_contained (NFC)	2020-11-21 15:47:11 -08:00
Kazu Hirata	43c0e4f665	[Transforms] Use llvm::is_contained (NFC)	2020-11-18 20:42:22 -08:00
Benjamin Kramer	7b782062b4	[InstCombine] Simplify code. NFCI.	2020-09-27 19:11:07 +02:00
Sanjay Patel	6bad3caeb0	[InstCombine] use unary shuffle creator to reduce code duplication; NFC	2020-09-21 15:34:24 -04:00
Simon Pilgrim	48b510c4bc	[NFC] Fix compiler warnings due to integer comparison of different signedness Fix by directly using INT_MAX and INT32_MAX. Patch by: @nullptr.cpp (Yang Fan) Differential Revision: https://reviews.llvm.org/D87347	2020-09-11 15:32:03 +01:00
Christopher Tetreault	640f20b0c7	[SVE] Remove calls to VectorType::getNumElements from InstCombine Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D82237	2020-08-31 12:59:10 -07:00
Nikita Popov	6093b14c2c	[InstCombine] Return replaceInstUsesWith() result (NFC) Follow the usual usage pattern for this function and return the result.	2020-08-29 14:49:57 +02:00
Roman Lebedev	71ac9105cd	[InstCombine] foldAggregateConstructionIntoAggregateReuse(): use InstCombiner::replaceInstUsesWith() instead of RAUW We really shouldn't use RAUW in InstCombine because we should consistently update Worklist to avoid extra iterations.	2020-08-29 15:10:14 +03:00
David Sherwood	f4257c5832	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
Roman Lebedev	2f01785857	[NFC][InstCombine] Aggregate reconstruction: use plain map Now that we no longer require for this map to have stable iteration order, we no longer need to pay for keeping the iteration order stable, so switch from `SmallMapVector` to `SmallDenseMap`.	2020-08-19 01:09:25 +03:00
Roman Lebedev	78bd4231bf	[InstCombine] PHI-aware aggregate reconstruction: properly handle duplicate predecessors While it may seem like we can just "deduplicate" the case where some basic block happens to be a predecessor more than once, which happens for e.g. switches, that is not correct thing to do. We must actually add a PHI operand for each predecessor. This was initially reported to me by David Major as a clang crash during gecko build for android.	2020-08-19 01:00:42 +03:00
Roman Lebedev	03127f795b	[InstCombine] PHI-aware aggregate reconstruction: correctly detect "use" basic block While the original implementation added in D85787 / `ae7f08812e` is not incorrect, it is known to be suboptimal. In particular, it is not incorrect to use the basic block in which the original `insertvalue` instruction is located as the merge point, that is not necessarily optimal, as `@test6` shows. We should look at all the AggElts, and, if they are all defined in the same basic block, then that is the basic block we should use. On RawSpeed library, this catches +4% (+50) more cases. On vanilla LLVM test-suits, this catches +12% (+92) more cases.	2020-08-18 00:45:18 +03:00
Roman Lebedev	f4f673e0e3	[NFC][InstCombine] PHI-aware aggregate reconstruction: don't capture UseBB in lambdas, take it as argument In a following patch, UseBB will be detected later, so capturing it is potentially error-prone (capture by ref vs by val). Also, parametrized UseBB will likely be needed for multiple levels of PHI indirections later on anyways.	2020-08-18 00:45:18 +03:00


257 Commits
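
Illustration for `db231ebdb0` ("fold fake vector extract to shift+trunc"): a minimal Python sketch, not LLVM code, of the arithmetic identity that commit message describes. It assumes a little-endian target and a plain bitcast from a scalar integer to a vector of equal-width integer elements; the function names and the byte-level reference model are hypothetical and exist only for this example.

```python
def extract_via_bitcast(x: int, total_bits: int, elt_bits: int, index: int) -> int:
    """Reference model (assumption: little-endian layout).

    Reinterpret the integer x as a vector of elt_bits-wide lanes,
    then read lane `index`.
    """
    raw = x.to_bytes(total_bits // 8, "little")
    step = elt_bits // 8
    lane = raw[index * step:(index + 1) * step]
    return int.from_bytes(lane, "little")


def extract_via_shift_trunc(x: int, elt_bits: int, index: int) -> int:
    """Folded form: logical shift right by index * elt_bits, then truncate."""
    shifted = x >> (index * elt_bits)       # models lshr
    return shifted & ((1 << elt_bits) - 1)  # models trunc to elt_bits


if __name__ == "__main__":
    x = 0x11223344
    # Treat the i32 as <4 x i8> and as <2 x i16>; both forms must agree.
    for lane in range(4):
        assert extract_via_bitcast(x, 32, 8, lane) == extract_via_shift_trunc(x, 8, lane)
    for lane in range(2):
        assert extract_via_bitcast(x, 32, 16, lane) == extract_via_shift_trunc(x, 16, lane)
```

On a big-endian target the lane order is reversed, so the shift amount would differ; the sketch does not model that case.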