llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	d172842b51	[DAG] SimplifyDemandedVectorElts - adjust demanded elements for selection mask for known zero results If an element is known zero from both selections then it shouldn't matter what the selection mask element is.	2022-07-13 17:36:05 +01:00
Philip Reames	fd67992f9c	[DAGCombine] fold (urem x, (lshr pow2, y)) -> (and x, (add (lshr pow2, y), -1)) We have the same fold in InstCombine - though implemented via OrZero flag on isKnownToBePowerOfTwo. The reasoning here is that either a) the result of the lshr is a power-of-two, or b) we have a div-by-zero triggering UB which we can ignore. Differential Revision: https://reviews.llvm.org/D129606	2022-07-13 08:34:38 -07:00
Sanjay Patel	d0eec5f7e7	[SDAG] enhance sub->xor fold to ignore signbit As suggested in the post-commit feedback for D128123, we can ease the mask constraint to ignore the MSB (and make the code easier to read by adjusting the check). https://alive2.llvm.org/ce/z/bbvqWv	2022-07-11 12:37:50 -04:00
Kazu Hirata	1fd6611fc8	[SelectionDAG] Restore calls to has_value (NFC) This patch restores calls to has_value to make it clear that we are checking the presence of an optional value, not the underlying value. This patch partially reverts `d08f34b592`. Differential Revision: https://reviews.llvm.org/D129454	2022-07-10 14:37:23 -07:00
Craig Topper	40866b74bd	[DAGCombiner][X86] Fold sra (sub AddC, (shl X, N1C)), N1C --> sext (sub AddC1',(trunc X to (width - N1C))) We already handled this case for add with a constant RHS. A similar pattern can occur for sub with a constant left hand side. Test cases use add and a mul representing (neg (shl X, C)) because that's what I saw in the wild. The mul will be decomposed and then the new transform can kick in. Tests have not been committed, but this patch shows the changes. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D128769	2022-07-09 11:53:44 -07:00
Sanjay Patel	8b75671314	[SDAG] try to replace subtract-from-constant with xor This is almost the same as the abandoned D48529, but it allows splat vector constants too. This replaces the x86-specific code that was added with the alternate patch D48557 with the original generic combine. This transform is a less restricted form of an existing InstCombine and the proposed SDAG equivalent for that in D128080: https://alive2.llvm.org/ce/z/OUm6N_ Differential Revision: https://reviews.llvm.org/D128123	2022-07-08 08:14:24 -04:00
Simon Pilgrim	7068c843d2	[DAG] visitREM - use isAllOnesOrAllOnesSplat instead of isConstOrConstSplat We were only using the N1C scalar/splat value once, so for clarity use isAllOnesOrAllOnesSplat instead if we actually need it.	2022-07-05 16:44:31 +01:00
Simon Pilgrim	e7a0fa4df0	[DAG] foldAddSubOfSignBit - don't bother creating the new shift node unless constant folding succeeds Noticed by inspection - the new shift is only ever used if the constant fold occurs	2022-07-05 16:44:31 +01:00
Simon Pilgrim	cce64e7a9c	[DAG] visitTRUNCATE - move GetDemandedBits AFTER SimplifyDemandedBits. Another cleanup step before removing GetDemandedBits entirely.	2022-07-04 11:25:40 +01:00
Kazu Hirata	94460f5136	Don't use Optional::hasValue (NFC) This patch replaces x.hasValue() with x where x is contextually convertible to bool.	2022-06-26 19:54:41 -07:00
Kazu Hirata	d08f34b592	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-26 18:31:51 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit `aa8feeefd3`.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
chenglin.bi	8c74205642	[SelectionDAG][DAGCombiner] Reuse exist node by reassociate When already have (op N0, N2), reassociate (op (op N0, N1), N2) to (op (op N0, N2), N1) to reuse the exist (op N0, N2) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122539	2022-06-24 23:15:06 +08:00
chenglin.bi	9c2bf534f5	Revert "[SelectionDAG][DAGCombiner] Reuse exist node by reassociate" This reverts commit `6c951c5ee6`.	2022-06-23 13:21:51 +08:00
Simon Pilgrim	1c2b756cd6	[DAG] visitTRUNCATE - move TRUNCATE(ADDE/ADDCARRY) folds to switch statement handling the other binops. NFC.	2022-06-21 22:07:41 +01:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
chenglin.bi	6c951c5ee6	[SelectionDAG][DAGCombiner] Reuse exist node by reassociate When already have (op N0, N2), reassociate (op (op N0, N1), N2) to (op (op N0, N2), N1) to reuse the exist (op N0, N2) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D122539	2022-06-21 09:45:19 +08:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
Simon Pilgrim	e4a124dda5	[DAG] Fold (srl (shl x, c1), c2) -> and(shl/srl(x, c3), m) Similar to the existing (shl (srl x, c1), c2) fold Part of the work to fix the regressions in D77804 Differential Revision: https://reviews.llvm.org/D125836	2022-06-20 08:37:38 +01:00
Craig Topper	314dbde12c	[DAGCombiner][ARM][RISCV] Teach ShrinkLoadReplaceStoreWithStore to use truncstore. The VT we want to shrink to may not be legal especially after type legalization. Fixes PR56110. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D128135	2022-06-19 15:50:15 -07:00
Benjamin Kramer	8c4a07c61f	[DAGCombiner] Fold fold (fp_to_bf16 (bf16_to_fp op)) -> op	2022-06-15 19:54:39 +02:00
Simon Pilgrim	f096d5926d	[DAG] Fix SDLoc mismatch in (shl (srl x, c1), c2) -> and(shift(x,c3)) fold Noticed by @craig.topper on D125836 which uses a tweaked copy of the same code. Differential Revision: https://reviews.llvm.org/D127772	2022-06-15 11:07:59 +01:00
Simon Pilgrim	7d8fd4f5db	[DAG] visitINSERT_VECTOR_ELT - attempt to reconstruct BUILD_VECTOR before other fold interfere Another issue unearthed by D127115 We take a long time to canonicalize an insert_vector_elt chain before being able to convert it into a build_vector - even if they are already in ascending insertion order, we fold the nodes one at a time into the build_vector 'seed', leaving plenty of time for other folds to alter it (in particular recognising when they come from extract_vector_elt resulting in a shuffle_vector that is much harder to fold with). D127115 makes this particularly difficult as we're almost guaranteed to have the lost the sequence before all possible insertions have been folded. This patch proposes to begin at the last insertion and attempt to collect all the (oneuse) insertions right away and create the build_vector before its too late. Differential Revision: https://reviews.llvm.org/D127595	2022-06-13 11:48:18 +01:00
Simon Pilgrim	54ae4ca755	[DAG] visitSRL - pull out ShiftVT. NFC.	2022-06-12 14:02:23 +01:00
Simon Pilgrim	cf5c63d187	[DAG] visitVECTOR_SHUFFLE - fold splat(insert_vector_elt()) and splat(scalar_to_vector()) to build_vector splats Addresses a number of regressions identified in D127115	2022-06-11 21:06:42 +01:00
Simon Pilgrim	44a0cd25df	[DAG] visitINSERT_VECTOR_ELT - add <1 x ???> insert_vector_elt(v0,extract_vector_elt(v1,0),0) special case handling Check if we're just replacing one v1x?? vector with another	2022-06-11 19:30:00 +01:00
Simon Pilgrim	a71ad6a3c8	[DAG] visitINSERT_VECTOR_ELT - fold insert_vector_elt(scalar_to_vector(x),v,i) -> build_vector() Allow scalar_to_vector nodes to be used for the start of a build_vector creation	2022-06-11 15:29:22 +01:00
Simon Pilgrim	693f4db1ec	[DAG] visitINSERT_VECTOR_ELT - refactor BUILD_VECTOR insertion to remove early-out. NFCI. Remove the early-out cases so we can more easily add additional folds in the future.	2022-06-11 12:01:13 +01:00
Simon Pilgrim	7dbfcfa735	[DAG] combineInsertEltToShuffle - if EXTRACT_VECTOR_ELT fails to match an existing shuffle op, try to replace an undef op if there is one. This should fix a number of shuffle regressions in D127115 where the re-ordered combines mean we fail to fold a EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT sequence into a BUILD_VECTOR if we extract from more than one vector source.	2022-06-09 14:56:14 +01:00
Simon Pilgrim	b84c10d4bc	[DAG] visitVSELECT - don't wait for truncation of sub before attempting to match with getTruncatedUSUBSAT Fixes some X86 PSUBUS regressions encountered in D127115 where the truncate was being replaced with a PACKSS/PACKUS before the fold got called again	2022-06-08 16:16:35 +01:00
Simon Pilgrim	a083f3caa1	[DAG] combineShuffleOfSplatVal - fold shuffle(splat,undef) -> splat, iff the splat contains no UNDEF elements As noticed on D127115 - we were missing this fold, instead just having the shuffle(shuffle(x,undef,splatmask),undef) fold. We should be able to merge these into one using SelectionDAG::isSplatValue, but we'll need to match the shuffle's undef handling first. This also exposed an issue in SelectionDAG::isSplatValue which was incorrectly propagating the undef mask across a bitcast (it was trying to just bail with a APInt::isSubsetOf if it found any undefs but that was actually the wrong way around so didn't fire for partial undef cases).	2022-06-07 16:42:24 +01:00
Guillaume Chatelet	0788186182	[Alignment][NFC] Remove usage of MemSDNode::getAlignment I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times. Differential Revision: https://reviews.llvm.org/D126910	2022-06-07 13:52:20 +00:00
Nikita Popov	5a64bc207e	[DAGCombiner] Remove overzealous assertion when folding assert+trunc+assert (PR55846) These assert that there are no "useless" assertzext/assertsext nodes (that assert a wider width than a following trunc), but I don't think there is anything preventing such nodes from reaching this code. I don't think the assertion is relevant for correctness of this transform either -- if such an assert is present, then the other one will always be to a smaller width, and we'll pick that one. The assertion dates back to D37017. Fixes https://github.com/llvm/llvm-project/issues/55846. Differential Revision: https://reviews.llvm.org/D126952	2022-06-07 09:50:26 +02:00
Benjamin Kramer	e8e4b741dd	[DAGCombiner] Add bf16 to the matrix of types that we don't promote to integer stores Remove a few stray semicolons while there.	2022-06-03 13:28:34 +02:00
Nikita Popov	ad742cf85d	[DAGCombine] Handle promotion of shift with both operands the same When promoting a shift, make sure we only fetch the second operand after promoting the first. Load promotion may replace users of the old load, and we don't want to be left with a dangling reference to the old load instruction. The crashing test case is from https://reviews.llvm.org/D126689#3553212. Differential Revision: https://reviews.llvm.org/D126886	2022-06-03 10:00:44 +02:00
Ping Deng	ae8ae45e2a	[DAGCombine][NFC] Add braces to 'else' to match braced 'if' Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D126624	2022-06-01 07:54:05 +00:00
Simon Pilgrim	f366acdbf6	[DAG] Generalize (sra (trunc (sra x, c1)), c2) -> (trunc (sra x, c1 + c2)) constant folding Remove local (uniform) constant folding and rely on getNode() to perform it Minor cleanup step toward adding non-uniform shift amount support	2022-05-26 14:05:09 +01:00
Simon Pilgrim	7b617eef80	[DAG] Cleanup "and/or of cmp with single bit diff" fold to use ISD::matchBinaryPredicate Prep work as I'm investigating some cases where TLI::convertSetCCLogicToBitwiseLogic should accept vectors.	2022-05-26 12:34:09 +01:00
Craig Topper	569d8945f3	[DAGCombiner][AArch64] Don't fold (smulo x, 2) -> (saddo x, x) if VT is i2. If the VT is i2, then 2 is really -2. Test has not been commited yet, but diff shows the change. Fixes PR55644. Differential Revision: https://reviews.llvm.org/D126213	2022-05-23 11:13:57 -07:00
Paul Walker	258dac43d6	[SVE] Enable use of 32bit gather/scatter indices for fixed length vectors Differential Revision: https://reviews.llvm.org/D125193	2022-05-22 12:32:30 +01:00
Jay Foad	6bec3e9303	[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions. The OrSelf versions additionally have the strange behaviour of allowing extending to a smaller width, or truncating to a larger width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting. Differential Revision: https://reviews.llvm.org/D125557	2022-05-19 11:23:13 +01:00
Craig Topper	46eef76876	[DAGCombiner] Fix bug in MatchBSwapHWordLow. This function tries to match (a >> 8) | (a << 8) as (bswap a) >> 16. If the SRL isn't masked and the high bits aren't demanded, we still need to ensure that bits 23:16 are zero. After the right shift they will be in bits 15:8 which is where the important bits from the SHL end up. It's only a bswap if the OR on bits 15:8 only takes the bits from the SHL. Fixes PR55484. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D125641	2022-05-18 09:23:18 -07:00
Simon Pilgrim	d40b7f0d5a	[DAG] Fold (shl (srl x, c), c) -> and(x, m) even if srl has other uses If we're using shift pairs to mask, then relax the one use limit if the shift amounts are equal - we'll only be generating a single AND node. AArch64 has a couple of regressions due to this, so I've enforced the existing one use limit inside a AArch64TargetLowering::shouldFoldConstantShiftPairToMask callback. Part of the work to fix the regressions in D77804 Differential Revision: https://reviews.llvm.org/D125607	2022-05-17 13:40:11 +01:00
Paul Walker	7dd05ba9ed	[SelectionDAG] Remove duplicate "is scaled" information from gather/scatter SDNodes. During early gather/scatter enablement two different approaches were taken to represent scaled indices: * A Scale operand whereby byte_offsets = Index * Scale * An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType) Having multiple representations is bad as shown by this patch which fixes instances where the two are out of sync. The dedicated scale operand is more flexible and pervasive so this patch removes the UNSCALED values from IndexType. This means all indices are scaled but the scale can be one, hence unscaled. SDNodes now use the scale operand to answer the "isScaledIndex" question. I toyed with the idea of keeping the UNSCALED enums and helper functions but because they will have no uses and force SDNodes to validate the set of supported values I figured it's best to remove them. We can re-add them if there's a real need. For similar reasons I've kept the IndexType enum when a bool could be used as I think being explicitly looks better. Depends On D123347 Differential Revision: https://reviews.llvm.org/D123381	2022-05-16 20:47:52 +01:00
Craig Topper	e6fc8454be	[DAGCombiner] Fix incorrect indentation. NFC	2022-05-16 09:27:15 -07:00
Bradley Smith	7ff5148d64	[DAGCombine] Support splat_vector nodes in (and (extload)) dagcombine Differential Revision: https://reviews.llvm.org/D125367	2022-05-16 11:25:20 +00:00
Simon Pilgrim	f4eac6e5f6	[DAG] visitOR - merge isa/cast<ShuffleVectorSDNode> into dyn_cast<ShuffleVectorSDNode>. NFC. Also, initialize entire mask to -1 to simplify undefined cases.	2022-05-14 20:49:26 +01:00
Simon Pilgrim	95cdd63b87	[DAG] visitADDLike - use SelectionDAG::FoldConstantArithmetic directly to match constant operands SelectionDAG::FoldConstantArithmetic determines if operands are foldable constants, so we don't need to bother with isConstantOrConstantVector / Opaque tests before calling it directly.	2022-05-14 18:39:41 +01:00
Simon Pilgrim	8db72d9d04	[DAG] visitMUL - pull out repeated SDLoc() calls. NFC.	2022-05-14 14:28:39 +01:00
Simon Pilgrim	8d4d4988e4	[DAG] Use SelectionDAG::FoldConstantArithmetic directly to match constant operands SelectionDAG::FoldConstantArithmetic determines if operands are foldable constants, so we don't need to bother with isConstantOrConstantVector / Opaque tests before calling it directly.	2022-05-14 14:19:12 +01:00
Simon Pilgrim	3fc33ced10	DAGCombiner.cpp - break if-else chains that always return (style)	2022-05-13 18:31:39 +01:00
Sanjay Patel	e52e1dab2a	[SDAG] freeze operand when expanging urem This is a potential miscompile as discussed in issue #55291. The related IR transform was patched with: `d428f09b2c`	2022-05-13 10:55:14 -04:00
David Green	2cfb243bcd	[DAG] Use isAnyConstantBuildVector. NFC As suggested from `02f8519502`, this uses the isAnyConstantBuildVector method in lieu of separate isBuildVectorOfConstantSDNodes calls. It should otherwise be an NFC.	2022-05-09 14:13:03 +01:00
David Green	02f8519502	[DAG] Prevent infinite loop combining bitcast shuffle This prevents an infinite loop from D123801, where code trying to reduce the total number of bitcasts, but also handling constants, could create the opposite transform. Prevent the transform in these case to let the bitcast of a constant transform naturally. Fixes #55345	2022-05-09 09:36:22 +01:00
Simon Pilgrim	800d36cf32	[DAG] Only perform the fold (A-B)+(C-D) --> (A+C)-(B+D) when both inner subs have one use Fixes #51381	2022-05-08 13:51:58 +01:00
Amaury Séchet	06fad8bc05	[DAGCombine] Add node in the worklist in topological order in CombineTo This is part of an ongoing effort toward making DAGCombine process the nodes in topological order. This is able to discover a couple of new optimizations, but also causes a couple of regression. I nevertheless chose to submit this patch for review as to start the discussion with people working on the backend so we can find a good way forward. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D124743	2022-05-07 16:24:31 +00:00
Paul Walker	702c4ade22	[ISD::IndexType] Helper functions for common queries. Add helper functions to query the signed and scaled properties of ISD::IndexType along with functions to change them. Remove setIndexType from MaskedGatherSDNode because it only has one usage and typically should only be changed alongside its index operand. Minimise the direct use of the enum values to lay the groundwork for more refactoring. Differential Revision: https://reviews.llvm.org/D123347	2022-05-07 11:23:42 +01:00
David Green	5930691ee1	Revert "[DAGCombine] Make combineShuffleOfBitcast LittleEndian specific" This reverts commit `891c3cf99e` as it turns out that the error was not caused by this commit, the error caming from D124526 instead.	2022-05-06 21:03:22 +01:00
David Green	891c3cf99e	[DAGCombine] Make combineShuffleOfBitcast LittleEndian specific Something is going wrong with the BigEndian PowerPC bot. It is hard to tell what is wrong from here, but attempt to fix it by disabling the combineShuffleOfBitcast combine for bigendian.	2022-05-06 18:42:44 +01:00
Simon Pilgrim	c0bebc12f0	[DAG] visitREM - merge buildOptimizedSREM into if(). NFCI.	2022-05-06 15:39:17 +01:00
David Green	115c188807	[DAG][PowerPC] Combine shuffle(bitcast(X), Mask) to bitcast(shuffle(X, Mask')) If the mask is made up of elements that form a mask in the higher type we can convert shuffle(bitcast into the bitcast type, simplifying the instruction sequence. A v4i32 2,3,0,1 for example can be treated as a 1,0 v2i64 shuffle. This helps clean up some of the AArch64 concat load combines, along with helping simplify a number of other tests. The PowerPC combine for v16i8 splat vector loads needed some fixes to keep it working for v16i8 vectors. This improves the handling of v2i64 shuffles to match too, hopefully improving them in general. Differential Revision: https://reviews.llvm.org/D123801	2022-05-06 10:50:31 +01:00
Craig Topper	4e2d1a6c18	[DAGCombiner] Fold (sext/zext undef) -> 0 and aext(undef) -> undef. Differential Revision: https://reviews.llvm.org/D124988	2022-05-05 09:34:18 -07:00
Craig Topper	fd13192aa5	[DAGCombiner] Fold (max/min X, X) -> X. Differential Revision: https://reviews.llvm.org/D124951	2022-05-05 09:34:17 -07:00
Nikita Popov	9678936f18	[DAGCombine] Fold (X & ~Y) | Y with truncated not This extends the (X & ~Y) | Y to X | Y fold to also work if ~Y is a truncated not (when taking into account the mask X). This is done by exporting the infrastructure added in D124856 and reusing it here. I've retained the old value of AllowUndefs=false, though probably this can be switched to true with extra test coverage. Differential Revision: https://reviews.llvm.org/D124930	2022-05-05 11:10:11 +02:00
Simon Pilgrim	faa35fc873	[DAG] Fix issue with rot(rot(x,c1),c2) -> rot(x,c1+c2) fold with unnormalized rotation amounts Don't assume the rotation amounts have been correctly normalized - do it as part of the constant folding. Also, the normalization should be performed with UREM not SREM.	2022-05-03 17:16:26 +01:00
Craig Topper	5f057eaa0d	[DAGCombiner] reassociationCanBreakAddressingModePattern should check uses of the outer add. When looking for memory uses, reassociationCanBreakAddressingModePattern should check uses of the outer ADD rather than the inner ADD. We want to know if the two ops we're reassociating are used by a load/store. In practice, the existing check usually works because CodeGenPrepare will make one of the load/stores have an offset of 0 relative to split GEP. That will make the inner add have a memory use. To test this, I've manually split the GEPs so there is no 0 offset store. This issue was recently discussed in the original review D60294. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D124644	2022-05-02 16:38:53 -07:00
Sanjay Patel	747c6a0c73	[SDAG] fix miscompile when casting int->FP->int This is the codegen equivalent of D124692. As shown in https://github.com/llvm/llvm-project/issues/55150 - the existing fold may be wrong when converting to a signed value. This is a quick fix to avoid the miscompile. https://alive2.llvm.org/ce/z/KtaDmd Differential Revision: https://reviews.llvm.org/D124771	2022-05-02 14:57:27 -04:00
Simon Pilgrim	ae8b10e543	[DAG] (style) Break apart if-else chain as they all return	2022-05-01 17:56:59 +01:00
Craig Topper	6affe87bda	[DAGCombiner] When matching a disguised rotate by constant don't forget to apply LHSMask/RHSMask. We try to match as a disguised rotate by constant of these forms (shl (X | Y), C1) | (srl X, C2) --> (rotl X, C1) | (shl Y, C1) (shl X, C1) | (srl (X | Y), C2) --> (rotl X, C1) | (srl Y, C2) We may have also looked through an AND to find the shift. If we did, we need to apply a mask to the result. I'll add an AArch64 test and pre-commit it and the RISC-V test tomorrow. Fixes PR55201. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D124711	2022-04-30 11:02:30 -07:00
Paul Walker	23c509754d	[DAGCombiner] Stop invalid sign conversion in refineIndexType. When looking through extends of gather/scatter indices it's safe to convert a known positive signed index to unsigned, but unsigned indices must remain unsigned. Depends On D123318 Differential Revision: https://reviews.llvm.org/D123326	2022-04-29 14:20:13 +01:00
Paul Walker	7a0b897e86	[DAGCombiner][SVE] Ensure MGATHER/MSCATTER addressing mode combines preserve index scaling refineUniformBase and selectGatherScatterAddrMode both attempt the transformation: base(0) + index(A+splat(B)) => base(B) + index(A) However, this is only safe when index is not implicitly scaled. Differential Revision: https://reviews.llvm.org/D123222	2022-04-29 12:35:16 +01:00
Simon Pilgrim	34e7243464	[DAG] Fold freeze(bitcast(x)) -> bitcast(freeze(x)) This is a very specific fold to fix an upstream poor codegen issue. InstCombine has the much more flexible pushFreezeToPreventPoisonFromPropagating but I don't think we're quite there with DAG/TLI handling for canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison value tracking yet. Fixes #54911 Differential Revision: https://reviews.llvm.org/D124185	2022-04-22 16:39:25 +01:00
Alexey Bataev	2cca53c815	[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer. We can process the long shuffles (working across several actual vector registers) in the best way if we take the actual register represantion into account. We can build more correct representation of register shuffles, improve number of recognised buildvector sequences. Also, same function can be used to improve the cost model for the shuffles. in future patches. Part of D100486 Differential Revision: https://reviews.llvm.org/D115653	2022-04-20 09:37:16 -07:00
Alexey Bataev	5f7ac15912	Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer." This reverts commit `2f49163b33` to fix a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284	2022-04-20 06:35:55 -07:00
Alexey Bataev	2f49163b33	[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer. We can process the long shuffles (working across several actual vector registers) in the best way if we take the actual register represantion into account. We can build more correct representation of register shuffles, improve number of recognised buildvector sequences. Also, same function can be used to improve the cost model for the shuffles. in future patches. Part of D100486 Differential Revision: https://reviews.llvm.org/D115653	2022-04-20 05:32:56 -07:00
chenglin.bi	222adf338a	[Arch64][SelectionDAG] Add target-specific implementation of srem 1. X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first. 2. Add AArch64 faster path for SREM only pow2 case. Fix https://github.com/llvm/llvm-project/issues/54649 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122968	2022-04-19 02:49:42 +08:00
chenglin.bi	acfc025a72	Revert "[Arch64][SelectionDAG] Add target-specific implementation of srem" This reverts commit `9d9eddd3dd`.	2022-04-18 10:35:09 +08:00
chenglin.bi	9d9eddd3dd	[Arch64][SelectionDAG] Add target-specific implementation of srem X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first. Add AArch64 faster path for SREM only pow2 case. Fix https://github.com/llvm/llvm-project/issues/54649 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122968	2022-04-16 12:29:11 +08:00
Craig Topper	c6dc229a6d	[DAGCombiner] Move call to hasOneUse after opcode checks. NFC Checking the opcode is cheap, counting the number of uses is not.	2022-04-15 17:02:16 -07:00
Craig Topper	a7b9d75e7a	[DAGCombiner] Move or/xor/and opcode check in ReduceLoadOpStoreWidth before hasOneUse check. hasOneUse is not cheap on nodes with chain results that might have many uses. By checking the opcode first, we can avoid a costly walk of the use list on nodes we aren't interested in. Found by investigating calls to hasNUsesOfValue from the example provided in D123857.	2022-04-15 16:38:27 -07:00
Simon Pilgrim	fef221bf1f	[DAG] Enable SimplifyVBinOp folds on add/sub sat intrinsics	2022-04-13 12:53:23 +01:00
Simon Pilgrim	cfb3ee2185	[DAG] Add non-uniform vector support to (shl (srl x, c1), c2) -> (and (shift x, c3)) Another part of D77804 yak shaving Differential Revision: https://reviews.llvm.org/D123523	2022-04-13 11:37:33 +01:00
Simon Pilgrim	bc32a1dd76	[DAG] Add non-uniform vector support to (shl (sr[la] exact X, C1), C2) folds	2022-04-12 12:57:56 +01:00
Craig Topper	35be4a7af3	[SelectionDAG] Remove unecessary null check after call to getNode. NFC As far as I know getNode will never return a null SDValue. I'm guessing this was modeled after the FoldConstantArithmetic call earlier. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D123550	2022-04-11 18:03:44 -07:00
Craig Topper	5b5f59428c	[DAGCombiner] Replace call getSExtOrTrunc with a truncate. NFC The extend case should never occur. The sign extend would be an arbitrary choice, remove it to avoid confusion.	2022-04-06 09:59:45 -07:00
Paul Walker	7d3af9ef0f	[DAGCombine] insert_subvector undef, (splat X), N2 -> splat X Differential Revision: https://reviews.llvm.org/D120328	2022-04-06 17:15:38 +01:00
zhongyunde	19e5235147	[AArch64][InstCombine] Fold MLOAD and zero extensions into MLOAD Accord the discussion in D122281, we missing an ISD::AND combine for MLOAD because it relies on BuildVectorSDNode is fails for scalable vectors. This patch is intend to handle that, so we can circle back the type MVT::nxv2i32 Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D122703	2022-04-06 20:50:42 +08:00
Simon Pilgrim	3369e474bb	[DAG] Allow XOR(X,MIN_SIGNED_VALUE) to perform AddLike folds As raised on PR52267, XOR(X,MIN_SIGNED_VALUE) can be treated as ADD(X,MIN_SIGNED_VALUE), so let these cases use the 'AddLike' folds, similar to how we perform no-common-bits OR(X,Y) cases. define i8 @src(i8 %x) { %r = xor i8 %x, 128 ret i8 %r } => define i8 @tgt(i8 %x) { %r = add i8 %x, 128 ret i8 %r } Transformation seems to be correct! https://alive2.llvm.org/ce/z/qV46E2 Differential Revision: https://reviews.llvm.org/D122754	2022-04-06 10:37:11 +01:00
Sanjay Patel	e18cc5277f	[SDAG] try to canonicalize logical shift after bswap When shifting by a byte-multiple: bswap (shl X, C) --> lshr (bswap X), C bswap (lshr X, C) --> shl (bswap X), C This is the backend version of D122010 and an alternative suggested in D120648. There's an extra check to make sure the shift amount is valid that was not in the rough draft. I'm not sure if there is a larger motivating case for RISCV (bug report?), but the ARM diffs show a benefit from having a late version of the transform (because we do not combine the loads in IR). Differential Revision: https://reviews.llvm.org/D122655	2022-03-30 09:29:32 -04:00
Craig Topper	e68257fcee	[RISCV][SelectionDAG] Enable TargetLowering::hasBitTest for masks that fit in ANDI. Modified DAGCombiner to pass the shift the bittest input and the shift amount to hasBitTest. This matches the other call to hasBitTest in TargetLowering.h This is an alternative to D122454. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D122458	2022-03-28 12:46:36 -07:00
Simon Pilgrim	e209190c2d	[SDAG] enable binop identity constant folds for multiplies Add mul to the list of ops that we canonicalize with a select to expose an identity merge Differential Revision: https://reviews.llvm.org/D122071	2022-03-25 11:07:04 +00:00
zhongyunde	828b89bc0b	[AArch64][SelectionDAG] Supports unpklo/hi instructions to reduce the number of loads Trying to reduce the number of masked loads in favour of more unpklo/hi instructions. Both ISD::ZEXTLOAD and ISD::SEXTLOAD are supported to extensions from legal types. Both of normal and masked loads test cases added to guard compile crash. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D120953	2022-03-21 23:47:33 +08:00
Simon Pilgrim	35a7be6ccb	[SDAG] enable binop identity constant folds for shifts Add shl/srl/sra to the list of ops that we canonicalize with a select to expose an identity merge Differential Revision: https://reviews.llvm.org/D122070	2022-03-21 13:02:50 +00:00
Luo, Yuanke	10bb623192	enable binop identity constant folds for add Differential Revision: https://reviews.llvm.org/D119654	2022-03-20 19:07:16 +08:00
Craig Topper	ad94dfb9a0	[DAGCombiner][RISCV] Adjust (aext (and (trunc x), cst)) -> (and x, cst) to sext cst based on target preference RISCV strong prefers i32 values be sign extended to i64. This combine was always zero extending the constant using APInt methods. This adjusts the code so that it calls getNode using ISD::ANY_EXTEND instead. getNode will call TLI.isSExtCheaperThanZExt to decide how to handle the constant. Tests were copied from D121598 where I noticed that we were creating constants that were hard to materialize. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D121650	2022-03-15 08:26:47 -07:00
Sanjay Patel	c2592c374e	[SDAG] simplify bitwise logic with repeated operand We do not have general reassociation here (and probably do not need it), but I noticed these were missing in patches/tests motivated by D111530, so we can at least handle the simplest patterns. The VE test diff looks correct, but we miss that pattern in IR currently: https://alive2.llvm.org/ce/z/u66_PM	2022-03-13 11:12:30 -04:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
Sanjay Patel	341623653d	[SDAG] match rotate pattern with extra 'or' operation This is another fold generalized from D111530. We can find a common source for a rotate operation hidden inside an 'or': https://alive2.llvm.org/ce/z/9pV8hn Deciding when this is profitable vs. a funnel-shift is tricky, but this does not show any regressions: if a target has a rotate but it does not have a funnel-shift, then try to form the rotate here. That is why we don't have x86 test diffs for the scalar tests that are duplicated from AArch64 ( `74a65e3834` ) - shld/shrd are available. That also makes it difficult to show vector diffs - the only case where I found a diff was on x86 AVX512 or XOP with i64 elements. There's an additional check for a legal type to avoid a problem seen with x86-32 where we form a 64-bit rotate but then it gets split inefficiently. We might avoid that by adding more rotate folds, but I didn't check to see what is missing on that path. This gets most of the motivating patterns for AArch64 / ARM that are in D111530. We still need a couple of enhancements to setcc pattern matching with rotate/funnel-shift to get the rest. Differential Revision: https://reviews.llvm.org/D120933	2022-03-09 13:19:00 -05:00
David Green	4388f4f776	[DAG] Don't convert undef to 0 when creating buildvector When inserting undef into buildvectors created from shuffles of buildvectors, we convert elements to the largest needed type. This had the effect of converting undef into 0, which isn't needed as the buildvector implicitly truncates and trunc(zext(undef)) == undef. Differential Revision: https://reviews.llvm.org/D121002	2022-03-06 18:35:34 +00:00
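
Several of the folds listed above rest on small integer identities that can be checked exhaustively at a narrow bit width. The sketch below is not LLVM code and does not reproduce the DAGCombiner's actual legality checks. It brute-forces three identities over 8-bit values, as they are read from the commit messages for fd67992f9c (urem by a shifted power of two), 8b75671314 and d0eec5f7e7 (subtract-from-constant as xor, ignoring the sign bit), and 3369e474bb (xor with the minimum signed value as add). All function names are hypothetical, and the precondition used for the sub->xor case (no bit of x below the MSB lies outside C) is an assumption about what the commits intend.

```cpp
#include <cstdint>

// Exhaustive 8-bit checks of three identities quoted in the commit log.
// Hypothetical helper names; this is an illustration, not LLVM source.

// fd67992f9c: (urem x, (lshr pow2, y)) == (and x, (add (lshr pow2, y), -1)).
// With pow2 = 0x80 and y in [0, 7] the divisor is always a nonzero power
// of two, so the division-by-zero (UB) case never arises in this loop.
bool uremOfShiftedPow2IsMask() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 8; ++y) {
      unsigned d = 0x80u >> y;
      if ((x % d) != (x & (d - 1)))
        return false;
    }
  return true;
}

// 8b75671314 / d0eec5f7e7: (sub C, x) == (xor x, C), modulo 2^8, assuming
// every set bit of x below the sign bit is also set in C. The sign bit is
// deliberately left unconstrained, mirroring "ignore signbit".
bool subFromConstantIsXor() {
  for (unsigned c = 0; c < 256; ++c)
    for (unsigned x = 0; x < 256; ++x) {
      if ((x & ~c & 0x7Fu) != 0)
        continue; // precondition not met; the identity is not claimed here
      if (uint8_t(c - x) != uint8_t(c ^ x))
        return false;
    }
  return true;
}

// 3369e474bb: XOR(X, MIN_SIGNED_VALUE) == ADD(X, MIN_SIGNED_VALUE), mod 2^8.
bool xorWithSignBitIsAdd() {
  for (unsigned x = 0; x < 256; ++x)
    if (uint8_t(x ^ 0x80u) != uint8_t(x + 0x80u))
      return false;
  return true;
}
```

Each function returns true only if its identity holds for every input it enumerates, so calling all three from a small driver is enough to confirm them at 8 bits. The commits themselves cite alive2 proofs for the general-width cases.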

3354 Commits